Ozgur Ozturk

Roswell, Georgia, United States
8K followers 500+ connections

About

Data Scientist with experience in
• Machine Learning (scikit-learn),
• Natural…

Experience

  • Kaiser Permanente

    Atlanta Metropolitan Area

  • -

    Baltimore County, Maryland, United States

  • -

    Greater Atlanta Area

  • -

    Dallas/Fort Worth Area

  • -

    Greater Atlanta Area

  • -

    Alpharetta, GA

  • -

    Greater Atlanta Area

  • -

    Lawrenceville, GA

  • -

    Greater Atlanta Area

  • -

    Greater Atlanta Area

  • -

    Greater Atlanta Area

  • -

    Atlanta, GA

  • -

    Columbus, Ohio Area

Volunteer Experience

  • Teaching JavaScript, TypeScript

    Local Meetups, Afterschool Programs, Conferences

    - Present 10 years 2 months

    Education

    - Taught JavaScript to eighth graders in their weekly after-school club for a semester.
    - Volunteered to present "Introduction to TypeScript" at Atlanta Code Camp and at Atlanta user group meetups.

  • Technical Reviewer

    Manning Publications Co.

    - Present 11 years 4 months

    Education

    Reviewed
    - Voice Applications for Alexa and Google Assistant
    - Torch in Action
    - Web Performance in Action
    - Sails.js in Action
    - Responsive WordPress Theming

Publications

  • Vector Space Indexing for Biosequence Similarity Searches

    Int’l Journal on Artificial Intelligence Tools

    We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequences into numerical vector domains and build efficient index structures on the transformed vectors. We then define distance functions in the transformed domain and examine properties of these functions. We experimentally compared their (a) approximation quality for k-Nearest Neighbor (k-NN) queries and both (b) pruning ability and (c) approximation quality for ε-range queries. Results for k-NN queries, which we present here, show that our proposed distances FD2 and WD2 (i.e. Frequency and Wavelet Distance functions for 2-grams) perform significantly better than the others. We then develop effective index structures, based on R-trees and scalar quantization, on top of transformed vectors and distance functions. Promising results from the experiments on real biosequence data sets are presented.

    Other authors
    • H. Ferhatosmanoglu
  • CoMRI: A Compressed Multi-Resolution Index Structure for Sequence Similarity Queries

    IEEE Computer Society Bioinformatics Conference (CSB '03)

    In this paper, we present CoMRI (Compressed Multi-Resolution Index), our system for fast sequence similarity search in DNA sequence databases. We employ the virtual bounding rectangle (VBR) concept to build a compressed, grid-style index structure. An advantage of the grid format over trees is that subsequence location information is given by the order of the corresponding VBR in the VBR list. Thanks to VBRs, our index structure fits easily into a reasonable amount of memory. Together with a new optimized multiresolution search algorithm, this improves query speed significantly. Extensive performance evaluations on human chromosome sequence data show that VBRs save 80%-93% of index storage size compared to MBRs (minimum bounding rectangles), and that the new search algorithm prunes almost all unnecessary VBRs, which guarantees efficient disk I/O and CPU usage. According to the results of our experiments, CoMRI is at least 100 times faster than MRS, another grid index structure introduced very recently.

    Other authors
    • H. Ferhatosmanoglu
    • H. Sun
  • Effective Indexing and Filtering for Similarity Search in Large Biosequence Databases

    IEEE International Symposium on Bioinformatics and Bioengineering (BIBE '03)

    We present a multi-dimensional indexing approach for fast sequence similarity search in DNA and protein databases. In particular, we propose effective transformations of subsequences into numerical vector domains and build efficient index structures on the transformed vectors. We then define distance functions in the transformed domain and examine properties of these functions. We experimentally compared their (a) approximation quality for k-Nearest Neighbor (k-NN) queries, (b) pruning ability and (c) approximation quality for ε-range queries. Results for k-NN queries, which we present here, show that our proposed distances FD2 and WD2 (i.e. Frequency and Wavelet Distance functions for 2-grams) perform significantly better than the others. We then develop effective index structures, based on R-trees and scalar quantization, on top of transformed vectors and distance functions. Promising results from the experiments on real biosequence data sets are presented.

    Other authors
    • H. Ferhatosmanoglu

Languages

  • English
  • Turkish
  • German (beginner)

Organizations

  • Southern Data Science Conference

    Program Committee Member

    - Present
