About
I use code, math, and data to solve business problems. I love what I do, and I run my own…
Activity
-
Leigh Parker is an amazing leader to be led by and the Federal Reserve is a wonderful organization to work for! Apply if interested!
Liked by Vishal Patel
-
Excited for my side project -- a fun and frivolous app. All the advice, guidance and strategic direction I have given clients and co-workers on what…
Liked by Vishal Patel
-
I’m excited to announce that I've begun a new role as Chief Technology Officer at Credytu! With Alexandra Villarreal O'Rourke, we're launching with…
Liked by Vishal Patel
Experience
Education
Publications
-
Exploring the Data Science Process
Data Intelligence Conference
Abstract: The data science process can be organized into multiple phases, and a standardized workflow helps team members collaborate effectively and generate valuable results. In this presentation, I will provide a detailed walk-through of seven phases of the data science process: (1) Business Understanding, (2) Data Preparation, (3) Data Munging, (4) Model Building, (5) Model Evaluation, (6) Model Deployment, and (7) Model Tracking. I will emphasize the parts of the process that are specifically relevant and interesting to machine learning practitioners. I will not discuss any specific machine learning techniques or the hottest new tool on the market; instead, we will explore the data science process from a bird’s-eye view. We will use some examples, take occasional detours, and dig deeper into some interesting areas to better understand how the different pieces of the data science puzzle fit together. The objective of this presentation is to introduce the phases of the data science process in a way that helps the audience think about data science more systematically.
The slides from this presentation can be found here: https://www.slideshare.net/VishalPatel321/exploring-the-data-science-process
-
A Practical Guide to Dimensionality Reduction Techniques
PyData DC 2016
In this presentation, I focus on several dimensionality reduction techniques for pre-processing data for supervised learning. A large dataset is used to demonstrate these techniques by following a dimensionality reduction workflow. The objective is to introduce several dimensionality reduction techniques, demonstrate how to implement them (using Python), and assess their efficacy for supervised learning problems.
-
Automatic Elbow Detection for Feature Reduction
Open Data Science Conference (ODSC) East 2016
Most machine learning projects involve datasets with high dimensionality. Multiple feature reduction techniques have been proposed and used by statisticians and data scientists over the years. One of those techniques involves a visual inspection of an ordered list of features, where the order is determined by a statistical measure such as absolute correlation with the target (dependent) feature. A plot of such an ordered list typically looks like a hockey stick. The purpose of the visual examination is to detect an inflection (aka ‘elbow’) in that L-shaped curve that would suggest a cut-off point. Once an elbow is detected, feature reduction is achieved by selecting the smaller set of top features that lie on one side of the elbow.
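As a rough illustration of replacing the visual step, the sketch below uses one of the ideas named in this abstract, the Euclidean distance from the origin. It is a minimal sketch under stated assumptions, not the presenter's implementation: the function name `find_elbow`, the rescaling of both axes to [0, 1], and the toy scores are all hypothetical choices made for this example.

```python
import numpy as np


def find_elbow(scores):
    """Return the index of the elbow in a list of feature scores.

    Hypothetical helper (not from the presentation). Scores are sorted in
    descending order, rank and score are rescaled to [0, 1], and the point
    with the smallest Euclidean distance from the origin is returned.
    """
    y = np.sort(np.asarray(scores, dtype=float))[::-1]
    if y.size < 3 or y[0] == y[-1]:
        raise ValueError("need at least 3 scores that are not all equal")

    x = np.arange(y.size, dtype=float)
    x_norm = x / x[-1]                         # rank rescaled to [0, 1]
    y_norm = (y - y[-1]) / (y[0] - y[-1])      # score rescaled to [0, 1]

    distance = np.hypot(x_norm, y_norm)        # distance from (0, 0)
    return int(np.argmin(distance))


# Toy example: absolute correlations with the target, made up for illustration.
abs_corr = [0.9, 0.7, 0.2, 0.1, 0.08, 0.05, 0.04, 0.03, 0.02, 0.01]
elbow = find_elbow(abs_corr)                   # index 2 for this toy list
top_scores = sorted(abs_corr, reverse=True)[: elbow + 1]
```

Whether the elbow point itself is kept (as `elbow + 1` does here) is a design choice; the abstract only says that the top features on one side of the elbow are selected.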
In this presentation, I discuss three approaches for automatically detecting an elbow in an ordered list of features. These methods allow data scientists to avoid relying on the manual step of visual inspection. The three methods involve (1) an application of the Kolmogorov-Smirnov statistic, (2) calculating the profile likelihood function, and (3) calculating the Euclidean distance from the origin, respectively. Based on simulated data, results from these methods are compared and displayed for visual scrutiny. In addition, the speed and complexity of the code are considered when assessing the efficacy and expediency of these three approaches.
-
A Comparison of Three Statistical Methods to Evaluate their Classification Accuracy
Southeast Decision Sciences Institute (SEDSI) 2012
Accurately classifying observations into predefined groups is a commonly encountered multivariate analysis problem. When the dependent variable has only two levels, binary logistic regression is the most widely used approach. However, researchers often come across dependent variables with more than two levels: for example, 1=Good, 2=Medium and 3=Low. When a qualitative dependent variable has three or more levels, there are several analytical techniques that a researcher can use to try to correctly classify observations into one of those mutually exclusive categories for the dependent variable. The purpose of this study is to compare the results of various analytical methods to statistically evaluate which method performs better in terms of achieving high classification accuracy.
Honors & Awards
-
Felix Baumgartner (Best Employee) Award
Razorsight
-
HAVAS Star Employee of the Year
Havas Discovery
-
Most Innovative Solution
DMA Analytics Challenge 2011
The goal of this challenge was to create an analytical framework that successfully predicts an individual’s response to a given e-mail promotion program. A historical sample of e-mail promotions for 500,000 individuals, along with their associated demographic characteristics, was provided as the challenge data set by Hearst and the Experian Corporation.
Our team placed a close second and also won the Most Innovative Solution award in this competition, which received over 500 entries from 51 countries.
-
Euro (Havas) Star Employee of the Year
Havas Discovery
-
Second Place in the DMA Analytics Challenge 2009
DMA (Direct Marketing Association)
The challenge was to apply a primary research segmentation scheme to an entire customer base. Participants were provided with data to build a classification algorithm for assigning the segments to the customer database. The data included behavioral transaction summaries, demography/lifestyles/interests, census data, and Silhouettes Segments.
More activity by Vishal
-
Curious about what habits the world's greatest business leaders follow? Well, for starters, they're all rooted in consistency. Check out the 5 habits…
Liked by Vishal Patel