About
Binil Kuriachan is a Sr. Applied Scientist at Microsoft (R&D) and part of Cybersecurity…
Articles by Binil
-
Data Science - How to start a career in data science
By Binil Kuriachan
Activity
-
⚡ Ever wondered how AI can stop those sneaky email scams? 💥 Well, you’re in for a treat! 🚨 We've got Binil Kuriachan and Renuka Talegaon from…
Liked by Binil Kuriachan
-
I am happy to share that I have received my PhD in Artificial Intelligence today from the Indian Institute of Technology, Madras. I thank my…
Liked by Binil Kuriachan
-
Last week Bing turned 15 years old. Everyone on the team should feel super proud of the incredible work over the years. As I said to them, I think…
Liked by Binil Kuriachan
Experience
Education
-
Indian Institute of Technology, Madras
Activities and Societies: Text Mining, Machine Learning, Data Preprocessing, Algorithms, Natural Language Processing
Worked on a research problem in Text Mining and Personalization using advanced analytics techniques. The work was presented at an international conference on Design Science Research (DESRIST), and the resulting research paper is published in a Springer journal.
* Candidate for MS Research * -
Activities and Societies: Ira A. Fulton Schools of Engineering at Arizona State University.
Master's in Computer Science - Machine Learning Specialization.
Algorithm, Data Structures, Deep Learning, Machine Learning, Statistical Modeling, Data Mining
Completed with GPA 4.0/4.0
Activities and Societies: IEEE, Programming, Code Debugging, Microprocessor Programming
Algorithms, Data Structures, Programming, Operating Systems, Database Management Systems, Compilers.
* Received Best Project Awards
* Won First Prize in Design Competition - Conducted by TCS
* Won Prizes for Programming Competitions -
Activities and Societies: Data Science, Deep Learning, Machine Learning, Artificial Intelligence, NLP, Statistical Modelling
Advanced specialisations in Data Science, Deep Learning, Statistical Modelling, Natural Language Processing and Programming (Python and R). These courses offer a deep understanding of both the theoretical and practical aspects of each area, supplemented with well-structured assignments and capstone projects. Individual courses are offered by reputed universities such as Stanford, the University of Washington, the University of Michigan and Johns Hopkins.
Licenses & Certifications
-
End-to-End Machine Learning with TensorFlow on GCP
Google Cloud (NJ)
-
Intermediate Python for Data Science
DataCamp
Credential ID 692c58c86c9df0dd9b682da08145d6e9717b31fb
Publications
-
DEEPCC: MULTIMODAL CROP CLASSIFICATION USING SATELLITE DATA AND CROP ROTATION PATTERNS
Microsoft Journal for Applied Research
Identification of crop types early in the growing season aids in many pre-harvest decisions, including monitoring food security, directing the best use of the landscape, supporting agricultural policy, enhancing supply chain efficiency through crop yield forecasting, and estimating crop water needs for irrigation planning. Therefore, there would be substantial advantages if a model could accurately predict crop type early or mid-season. In this study, using temporal patterns from multitemporal satellite images and crop rotation patterns from the previous year, we developed a robust deep learning model that can classify various crop types. This model can predict the crop type at the pixel level for a given boundary with latitude/longitude coordinates. CDL data is used in our study to label training samples on satellite images for crop type classification and to track crop rotation patterns from previous years. The greenness and density of the vegetation visible in satellite images are measured using the Normalized Difference Vegetation Index (NDVI), which exhibits different temporal patterns for different crops during a growing season. We have selected a sample of pixel-level NDVI for each crop type over a specified period to train a deep learning network to recognize temporal signals. Crop rotation is used to promote soil health and increase the amount of nutrients available for crops since monocropping, the practice of growing the same crop in the same location for an extended period, gradually depletes the soil quality. To give the model more intelligence, crop rotation signals from prior years are incorporated. The LSTM (Long Short-Term Memory) deep learning model is developed utilizing temporal inputs to attain an overall accuracy of 85%. The accuracy is observed to be improved by using crop rotation signals as a multi-headed ensemble model. Our suggested method can be applied to a variety of scenarios, including early, mid, and late season predictions.
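The NDVI signal at the core of this approach is computed directly from the red and near-infrared bands. A minimal sketch of that per-pixel computation (the band values and the epsilon guard against division by zero are illustrative assumptions, not the study's data):

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray, eps: float = 1e-9) -> np.ndarray:
    """Per-pixel Normalized Difference Vegetation Index.

    NDVI = (NIR - Red) / (NIR + Red), in [-1, 1]; higher values mean
    greener, denser vegetation, and the value traces a crop-specific
    curve over the growing season.
    """
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)  # eps guards against zero denominators

# A pixel's NDVI time series over the season (one value per satellite
# acquisition) is the temporal signal a sequence model such as the
# LSTM described above would consume.
nir_band = np.array([0.6, 0.7, 0.8])
red_band = np.array([0.2, 0.1, 0.1])
series = ndvi(nir_band, red_band)  # ~[0.5, 0.75, 0.778]
```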
AI Enabled Context Sensitive Information Retrieval System
Springer International Publishing
We developed Smart Search, which combines the capabilities of keyword search, with a reduced need for indexing, with Artificial Intelligence that helps understand the semantics of a user query and find the most relevant information. We developed an indigenous Named Entity Recognition model and custom functions to identify entities from the search text. The keyword search module of Smart Search pre-processes the document corpus and transforms it into distributable matrices comprising numerical scores for each word present in each document. The matrix can be distributed for efficient computation, and an index is maintained that maps each word to its column position. Numerical scores can be calculated using many methods, such as TF-IDF, count vectorizer, binary vectorizer, etc. When search text arrives, it is also transformed into a vector of numerical scores using the same index. The relevant documents can be identified by taking the dot product of the input vector with the respective vectors from the corpus matrix. In this work, various Machine Learning architectures were also experimented with on the document corpus to calculate a word embedding for each word based on its context in the documents. A word embedding represents a word as a vector of numbers, where each value can be seen as representing a different feature that explains the word. Embeddings such as Word2Vec and GloVe are used, and weights are updated from the corpus. The document corpus is also used to learn the proportion of different topics using the Latent Dirichlet Allocation algorithm, one of the popular unsupervised learning methods for identifying latent topics in text documents. The calculated word embeddings and LDA are used to understand the semantic similarity of user queries and documents. In order to limit the search space, classification models are trained from the dataset. Smart Search encompasses the best capabilities of learning from the data as well as the keyword search approach.
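The keyword-search scoring described here, a TF-IDF document-term matrix, a query vector built with the same vocabulary, and a dot product for ranking, can be sketched with scikit-learn (the toy corpus and query are illustrative assumptions):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "machine learning for text search",
    "cooking recipes and kitchen tips",
    "semantic search with word embeddings",
]

# Transform the corpus into a document-term matrix of TF-IDF scores;
# the vectorizer's vocabulary_ plays the role of the word/column index.
vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(corpus)

# Transform the query with the same vocabulary and rank documents by
# the dot product of the query vector with each document vector.
query_vec = vectorizer.transform(["semantic text search"])
scores = (doc_matrix @ query_vec.T).toarray().ravel()
ranked = np.argsort(scores)[::-1]  # best-matching documents first
```

The unrelated cooking document scores exactly zero, so it lands last in the ranking.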
Assurance of Supply (AoS) Analytics: Predictive Models on Purchase Order Shipping and Delivery Schedules
Boeing Technical Conference
Part shortages have long been a stumbling block hindering the smooth functioning of Boeing assembly lines. The Assurance of Supply Analytics Platform (AoSAP) is an application being developed in this regard to help users identify and address the drivers of part shortages across the integrated value chain, from program start-up through engineering requirements release, purchase order placement, supplier delivery, and manufacturing build. It has been observed that, among the drivers of part shortages, supplier-related delays (late deliveries and defective parts) alone contribute a significant amount. As part of the Assurance of Supply (AoS) objective, we address some of these challenges by using predictive analytics that can help reduce supplier-caused late deliveries for the 737 line. Delays can happen due to late shipping as well as due to issues with transportation, which result in on-dock deliveries deviating from the scheduled dates. The focus areas of our work are: 1. a comprehensive understanding of the business process, which enabled us to create a list of parameters correlated with supplier performance, shipping and on-dock delays; 2. building machine learning models using historical discrete purchase order data to find patterns and predict possible shipping and on-dock delays for purchase orders planned for future dates. On-time delivery of planned purchase orders is crucial to maintaining uninterrupted assembly line operations. Knowing the possible delays in advance helps the operations team plan accordingly and take action to mitigate the situation, thereby reducing the chance of a part shortage.
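The second focus area, learning delay patterns from historical purchase-order data, can be sketched with a random-forest classifier on synthetic data (the feature names, data, and label rule below are illustrative assumptions, not the actual AoSAP parameter list):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for historical discrete purchase-order data; the
# features (lead time, supplier on-time rate, order quantity) are
# illustrative assumptions.
n = 500
lead_time_days = rng.integers(5, 120, n)
supplier_on_time = rng.uniform(0.5, 1.0, n)
order_quantity = rng.integers(1, 200, n)
X = np.column_stack([lead_time_days, supplier_on_time, order_quantity])

# Label: 1 if the historical delivery was late. Synthesized here so
# short lead times and weak supplier performance raise the risk.
late = ((lead_time_days < 20) | (supplier_on_time < 0.7)).astype(int)

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, late)
risk = model.predict_proba(X[:5])[:, 1]  # late-delivery risk for 5 orders
```

In practice the predicted probabilities for future-dated purchase orders would feed the planning workflow described above.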
ALDA: An Aggregated LDA for Polarity Enhanced Aspect Identification Technique in Mobile App Domain
Springer International Publishing
With the increased popularity of smart mobile devices, mobile applications (a.k.a. apps) have become essential. While app developers face an extensive challenge to improve user satisfaction by exploiting valuable feedback, app users are overloaded with far too many apps. Extracting the valuable features of apps and mining the associated sentiments is of utmost importance for app developers. Similarly, from the user perspective, the key preferences should be identified. This work deals with profiling users and apps using a novel LDA-based aspect identification technique. A polarity aggregation technique is used to tag the weak features of the apps that developers should concentrate on. The proposed technique has been evaluated on an Android review dataset to validate its efficacy compared to state-of-the-art algorithms. Experimental findings suggest the superiority and applicability of our model in practical scenarios.
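The LDA-based aspect identification at the core of this technique can be sketched with scikit-learn (the toy reviews and the two-aspect split are illustrative assumptions, not the paper's dataset or exact model):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy app reviews; the battery-vs-interface split is an assumption.
reviews = [
    "battery drains fast and phone gets hot",
    "battery life is terrible after update",
    "love the clean interface and smooth design",
    "interface is intuitive and the design is beautiful",
]

# LDA works on word counts; fit a two-topic model over the reviews.
counts = CountVectorizer(stop_words="english").fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Per-review topic mixture: each row is a distribution over aspects,
# which polarity scores could then be aggregated against.
mixture = lda.transform(counts)
```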
Comment Prediction in Facebook Pages using Regression Techniques
IJRASET - International Journal for Research in Applied Science and Engineering Technology
Data in social networks is increasing day by day and requires highly capable services to manage the large amount of data involved. This work studies user activity patterns in social networks, concentrating on an active social network, Facebook, and especially on Facebook Pages. Here, user comment volume prediction is made based on page category, i.e., a post from a particular category of page will receive a certain number of comments. The goals are to predict the comment volume for each page and to find which page category gets the most comments. Preliminary work concluded with a decision tree, so in this further study we analyzed some more regression techniques to make the prediction effective. In this work, we modelled the user comment pattern with respect to page likes and popularity, page category and time. Decision Tree, LASSO, K-Nearest Neighbors (KNN), Random Forest, and Linear Regression techniques are used. The error is measured with the Root Mean Square Error (RMSE) metric. We concluded that the K-Nearest Neighbors algorithm performs well and gives effective predictions.
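The winning approach, K-Nearest Neighbors regression evaluated with RMSE, can be sketched on synthetic page features (the feature semantics and data below are illustrative assumptions, not the study's dataset):

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for page features (likes, popularity, category,
# posting time) -> comment volume; names are illustrative assumptions.
X = rng.uniform(0, 1, (200, 4))
y = 100 * X[:, 0] + 50 * X[:, 1] + rng.normal(0, 5, 200)

X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Predict a post's comment volume as the average of its 5 nearest
# neighbors in feature space.
knn = KNeighborsRegressor(n_neighbors=5).fit(X_train, y_train)
pred = knn.predict(X_test)

# Root Mean Square Error, the metric used in the paper.
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
```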
Chronic Kidney Disease Analysis using Machine Learning Algorithms
IJRASET - International Journal for Research in Applied Science and Engineering Technology
Nowadays, Data Mining technology has become the trend for diagnostic results. Innumerable efforts have been put in to cope with the explosion of medical data, retrieving useful information from it and making predictions with the available information. The main objective of this research paper is predicting the presence of Chronic Kidney Disease (CKD) in patients by using various classification algorithms such as Naïve Bayes, Random Forest, Logistic Regression, Linear Discriminant Analysis (LDA) and Quadratic Discriminant Analysis (QDA), and also finding the most predictive and influential features for the target variable by using a feature selection technique (Chi-Square test) and the Gini impurity index. The R tool is used for implementing these classification techniques. The performance of all stated algorithms is compared on accuracy, precision and recall in order to determine the best classifier for predicting patients with chronic kidney disease.
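The Chi-Square feature-selection step described above (the paper uses R; this sketch uses scikit-learn's equivalent) can be illustrated on synthetic data. The features and label rule are illustrative assumptions; note chi2 requires non-negative features:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

rng = np.random.default_rng(0)

# Synthetic non-negative clinical-style features; only the first two
# actually drive the label, mimicking the search for the most
# predictive CKD indicators (feature semantics are assumptions).
n = 300
X = rng.uniform(0, 10, (n, 6))
y = ((X[:, 0] > 5) & (X[:, 1] > 5)).astype(int)

# The Chi-Square test scores each feature against the target;
# keep the two highest-scoring features.
selector = SelectKBest(chi2, k=2).fit(X, y)
top_features = np.argsort(selector.scores_)[::-1][:2]
```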
Courses
-
Applied Machine Learning
University of Michigan
-
Applied Text Mining
University of Michigan
-
CSE 511 - Data Processing at Scale
CSE 511
-
CSE 551 - Foundations of Algorithms
CSE 551
-
CSE 565 - Software Verification Validation & Test
CSE 565
-
CSE 571 - Artificial Intelligence
CSE 571
-
CSE 572 - Data Mining
CSE 572
-
CSE 575 - Statistical Machine Learning
CSE 575
-
CSE 598 - Deep Learning for Computer Vision
CSE 598
-
Convolutional Neural Networks
-
Data Science using Python
University of Michigan
-
Getting and Cleaning Data
Johns Hopkins University
-
Improving Deep Neural Networks
Deeplearning.ai
-
Machine Learning Specialization
University of Washington
-
Machine Learning: Classification
University of Washington
-
Machine Learning: Regression
University of Washington
-
Neural Networks and Deep Learning
SNFRHANPCF97
-
Sequence Models (RNN)
JRHUZN8ZXJGD
-
Statistical Inference
Johns Hopkins University
-
Statistical Learning
Stanford University
Projects
-
Custom Deep Models for Phishing Mail Detection focusing on Business Email Compromise
Business Email Compromise (BEC) is one of the major phishing mail categories and poses a major threat to enterprise customers. In 2022, BEC led to losses of ~$3 billion in the US alone. I have been working on building an efficient solution to detect and prevent BEC attacks originating from email for enterprises.
Focus areas include:
* Traffic selection
* Data pipeline for training
* ML system design
* Design and build of a scalable solution
* Deployment
* Real-time inference
Microsoft Security Copilot: Web Grader
-
Built and deployed LLM Grader for Web as part of Microsoft Security Copilot
-
Web Defense & Anti Phishing Module – Microsoft Defender for Office
-
Building AI solutions for cybersecurity in the M365 product suite:
Phishing Mail Identification & Generation (Attack Simulation module)
Predicted Compromise Rate and Phish Susceptibility Score (user level)
Web Defense System for Web Protection
Crop Classification using multi-spectral Geo-Spatial Data (Satellite/Weather)
-
Crop Classification using multi-spectral Geo-Spatial Data (Satellite/Weather) for Azure FarmBeats platform. Identification of crop types early in the growing season aids in many pre-harvest decisions, including monitoring food security, directing the best use of the landscape, supporting agricultural policy, enhancing supply chain efficiency through crop yield forecasting, and estimating crop water needs for irrigation planning. Therefore, there would be substantial advantages if a model could accurately predict crop type early or mid-season. We built a robust Deep Learning Model that can classify the crop types by using temporal patterns emerging from multitemporal satellite images and past year's crop rotation patterns. This model can predict the crop type at the pixel level for a given boundary with latitude/longitude coordinates. CDL data is used in our study to label training samples on satellite images for crop type classification and to track crop rotation patterns from previous years. The greenness and density of the vegetation visible in satellite images are measured using the Normalized Difference Vegetation Index (NDVI), which exhibits different temporal patterns for different crops during a growing season. Crop rotation is used to promote soil health and increase the amount of nutrients available for crops since monocropping, the practice of growing the same crop in the same location for an extended period, gradually depletes the soil quality. To give the model more intelligence, crop rotation signals from prior years are incorporated. A sequence-to-sequence deep learning model is developed utilizing crop rotation signals and temporal inputs to attain an overall accuracy of 90%. The accuracy can be improved by using more temporal data over a growing season. Due to its multimodality, our suggested method can be applied to a variety of scenarios, including early, mid, and late season predictions.
-
Azure FarmBeats: NDVI forecast using satellite imagery and weather data
-
Azure FarmBeats is a business-to-business offering available in Azure Marketplace. It enables aggregation of agriculture data sets across providers, letting you build artificial intelligence (AI) or machine learning (ML) models based on fused data sets. By using Azure FarmBeats, agriculture businesses can focus on core value-adds instead of the undifferentiated heavy lifting of data engineering. As part of FarmBeats, we built an NDVI forecast model using 60 days of historical satellite imagery, 30 days of historical weather data and 10 days of forecasted weather data. This project demonstrates different capabilities of Azure FarmBeats and how it can be used effectively for building end-to-end machine learning models for the agriculture and food industry.
-
Data Imputation in geospatial time series data with advanced ML techniques
-
Data Imputation in geospatial time series data with deep learning techniques.
-
Smart Search – AI Enabled Scalable and Efficient Context Sensitive Information Retrieval System
-
We developed Smart Search, which combines the capabilities of keyword search, with a reduced need for indexing, with Artificial Intelligence that helps understand the semantics of a user query and find the most relevant information. We developed an indigenous Named Entity Recognition model and custom functions to identify entities from the search text. The keyword search module of Smart Search pre-processes the document corpus and transforms it into distributable matrices comprising numerical scores for each word present in each document. The matrix can be distributed for efficient computation, and an index is maintained that maps each word to its column position. Numerical scores can be calculated using many methods, such as TF-IDF, count vectorizer, binary vectorizer, etc. When search text arrives, it is also transformed into a vector of numerical scores using the same index. The relevant documents can be identified by taking the dot product of the input vector with the respective vectors from the corpus matrix. In this work, various Machine Learning architectures were also experimented with on the document corpus to calculate a word embedding for each word based on its context in the documents. A word embedding represents a word as a vector of numbers, where each value can be seen as representing a different feature that explains the word. Embeddings such as Word2Vec and GloVe are used, and weights are updated from the corpus. The document corpus is also used to learn the proportion of different topics using the Latent Dirichlet Allocation algorithm, one of the popular unsupervised learning methods for identifying latent topics in text documents. The calculated word embeddings and LDA are used to understand the semantic similarity of user queries and documents. In order to limit the search space, classification models are trained from the dataset. Smart Search encompasses the best capabilities of learning from the data as well as the keyword search approach.
-
Assurance of Supply (AoS) Analytics
-
The AoS Analytics Platform was developed to help users identify and address the drivers of part shortages across the integrated value chain, from program start-up through engineering requirements release, purchase order placement, supplier delivery, and manufacturing build. This project focuses on building insights and using predictive analytics to help reduce part shortages and enable efficient planning.
-
Future State Architecture Tools -Deep Learning based Named Entity Recognition Model for detecting and categorizing Keywords
-
FSAT is an architecture tool that helps users by suggesting the possible future architecture of an application based on its current architecture. In this project, we created a Deep Learning based NER model, a Bidirectional LSTM (Long Short-Term Memory) network backed by Convolutional Neural Networks (character, word and case embeddings), for detecting technical keywords in Boeing proprietary technical documents and associating them with the respective categories. Model performance was evaluated with the F1 score.
Tools and Technology: TensorFlow, Python, NLP, CNN, LSTM
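A sequence tagger like the BiLSTM-CNN model above emits one label per token; turning those labels into keyword spans is a small decoding step. A sketch assuming the common BIO tagging scheme (the tag names and example tokens are illustrative, not Boeing's categories):

```python
def decode_bio(tokens, tags):
    """Collapse per-token BIO tags into (entity_text, category) spans.

    'B-X' starts an entity of category X, 'I-X' continues it,
    and 'O' marks tokens outside any entity.
    """
    spans, current, category = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:  # close the previous entity before starting a new one
                spans.append((" ".join(current), category))
            current, category = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == category:
            current.append(token)
        else:
            if current:
                spans.append((" ".join(current), category))
            current, category = [], None
    if current:  # flush an entity that runs to the end of the sentence
        spans.append((" ".join(current), category))
    return spans

tokens = ["deploy", "on", "Apache", "Kafka", "using", "Python"]
tags = ["O", "O", "B-TECH", "I-TECH", "O", "B-LANG"]
# decode_bio(tokens, tags) → [("Apache Kafka", "TECH"), ("Python", "LANG")]
```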
-
Flight Assembly Line Move - Deep Neural Network Models for Object Detection and Sequence Modeling
-
The Line Move project focuses on identifying different equipment (using camera feeds) being used during flight assembly and predicting the likely sequence, for better organization and for understanding anomalies.
In this project, we built two separate Deep Learning models (CNN and RNN) for object detection and sequence modeling. For the object detection task of identifying various equipment used in the factory line move of flight assembly, we leveraged a few state-of-the-art architectures such as YOLO, Fast-RCNN and Mask-RCNN, and trained them on custom images. After evaluating performance, the model created with YOLO was selected for deployment. For the sequence modeling task of identifying the equipment sequence, we created a stacked LSTM network with 200 memory cells and variable-length inputs to predict upcoming equipment requirements.
Technology Used: Python, TensorFlow, Keras, Convolutional Neural Networks, Recurrent Neural Networks, Sequence to Sequence Models, LSTM etc.
-
Deep Neural based Encoder-Decoder model for Multi-Level Classification Problem
-
Created a deep neural Encoder-Decoder sequence model for handling multilevel classification in a single shot. In this project, the input text consisted of customer-facing issues which usually end up in multiple queues before getting the final fix. Our aim was to identify the sequence of queues without any manual intervention. We solved it using a sequence model with an Encoder-Decoder architecture, since the input and output records have variable lengths.
Tools and Technologies Used: Python, TensorFlow, Deep Learning, Sequence Models, Encoder-Decoder architecture, LSTM, GRU, etc.
Incident Management Auto-response Prediction
-
In this work, we built a multi-phase Machine Learning model which predicts the content category of an email and triggers the corresponding response template to the user. We built a first-level multinomial classification model (SVM, ensemble models, Deep Learning, etc.) which predicts the high-level groups, and subsequently employed a topic model using Latent Dirichlet Allocation (LDA) to figure out the topic distribution. Tools and Technology: NLTK, Scikit-Learn, Python, Bag-of-Words, Vectorization, Data Mining, Nearest Neighbors, Document Similarity, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Word Embedding, Semantic Analysis, Deep Learning.
-
Best Resolution Prediction
-
In this project, we built a Knowledge Base resolution prediction system by using vectorization along with semantic document similarity on previous 'Incident Records' as well as 'Knowledge Articles'. To improve the accuracy of prediction, incident and knowledge records are filtered by configuration item using supervised Machine Learning classification models (Deep Learning, CNN, Word Embedding, SVM, ensemble models, Logistic Regression, Naive Bayes). Tools and Technology: NLTK, Scikit-Learn, Python, Bag-of-Words, Vectorization, Data Mining, Nearest Neighbors, Document Similarity, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Word Embedding, Semantic Analysis, Deep Learning, Keras, TensorFlow.
-
ALDA: An Aggregated LDA for Polarity Enhanced Aspect Identification Technique in Mobile App Domain
-
With the increased popularity of smart mobile devices, mobile applications (a.k.a. apps) have become essential. While app developers face the extensive challenge of improving user satisfaction by exploiting valuable feedback, app users are overloaded with far too many apps. Extracting the valuable features of apps and mining the associated sentiments is of utmost importance for app developers; similarly, from the user perspective, key preferences should be identified. This work deals with profiling users and apps using a novel LDA-based aspect identification technique. A polarity aggregation technique is used to tag the weak features of the apps that developers should concentrate on. The proposed technique has been evaluated on an Android review dataset to validate its efficacy against state-of-the-art algorithms. Experimental findings suggest the superiority and applicability of our model in practical scenarios.
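The core idea, LDA for aspect identification followed by polarity aggregation per aspect, could be sketched as below. The reviews and the tiny sentiment lexicon are illustrative stand-ins, not the paper's dataset or its full ALDA formulation:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

reviews = [
    "battery drains fast, battery life is bad",
    "great interface, the ui design looks clean",
    "battery overheats and dies quickly",
    "love the ui, smooth interface",
]
# Toy word-polarity lexicon (placeholder for a real sentiment dictionary)
lexicon = {"bad": -1, "great": 1, "clean": 1, "love": 1, "smooth": 1,
           "drains": -1, "overheats": -1, "dies": -1}

vec = CountVectorizer()
X = vec.fit_transform(reviews)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
aspect = lda.transform(X).argmax(axis=1)  # dominant aspect per review

# Aggregate review polarity into its dominant aspect's score
polarity = np.zeros(2)
for i, text in enumerate(reviews):
    polarity[aspect[i]] += sum(lexicon.get(w, 0) for w in text.split())

weak_aspect = int(polarity.argmin())  # aspect flagged for developer attention
print(polarity, weak_aspect)
```

The aspect with the lowest aggregated polarity is the "weak feature" the developers should concentrate on.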
Tools and Technologies: Latent Dirichlet Allocation (LDA), Semantic Analysis, Python, R, Natural Language Processing, NLTK, Scikit-Learn, Tableau, Machine Learning
-
Aspect Level Sentiment Analysis from Free Form Text
-
This project identifies the sentiment distribution across multiple topics in free-form text (feedback, surveys, collaboration tools, chat data, etc.). We built LDA-based topic modeling combined with both dictionary-based and ML-based sentiment analysis to aggregate the polarity across topics. The results are used for management decisions and for fine-tuning the respective work areas. For aspect identification we developed a Latent Dirichlet Allocation (LDA) model, and for ML-based sentiment analysis we evaluated models using SVM, CNN + RNN (LSTM), logistic regression with L2 regularization, Naive Bayes, and random forest. TextBlob is used for dictionary-based sentiment analysis.
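The ML-based sentiment component could be sketched as an L2-regularised logistic regression over TF-IDF features; the feedback snippets and labels below are placeholders, not project data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

feedback = ["the new process is excellent",
            "great collaboration this sprint",
            "the tooling is terrible",
            "deployment was a horrible experience"]
labels = ["pos", "pos", "neg", "neg"]

# penalty="l2" is scikit-learn's default; shown explicitly to match the
# "logistic regression with L2 regularization" variant described above
model = make_pipeline(TfidfVectorizer(),
                      LogisticRegression(penalty="l2", C=1.0, max_iter=1000))
model.fit(feedback, labels)
pred = model.predict(["terrible tooling again"])[0]
print(pred)
```

Per-document predictions like this would then be aggregated within each LDA topic to give the per-topic polarity distribution.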
Tools and Technology: NLTK, Scikit-Learn, Keras, TensorFlow, Python, Bag-of-Words, Word Embedding, Word2Vec, Vectorization, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Semantic Analysis, Deep Learning, Gensim.
-
Predicting the Color Deviation Metric for Large-Scale Printers
-
Given data from machines, we created a regression model to predict the range and deviation of the outcome variable, one of the color metrics, from the values of all other predictors. For each color, a threshold value for deviation was given. The predicted deviation was used to identify the deviation likely to occur in a particular situation, and this information was used to adjust the values of the significant predictors. Tools and Technologies: Python, R, Regression Models (Multivariate, Ridge, Lasso, KNN, Tree Models), Classification Models (Logistic Regression, Random Forest), Deep Neural Networks, Scikit-Learn, SQL Developer
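The workflow can be sketched with one of the listed models, ridge regression, on synthetic predictor readings; the data, coefficients, and threshold below are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # synthetic machine predictor readings
# Synthetic deviation driven by two predictors plus small noise
y = 0.5 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(scale=0.05, size=200)

model = Ridge(alpha=1.0).fit(X, y)
pred = model.predict(np.array([[2.0, -1.0, 0.0]]))[0]

THRESHOLD = 1.0  # placeholder per-colour deviation limit
needs_adjustment = abs(pred) > THRESHOLD
print(round(pred, 2), needs_adjustment)
```

Comparing the predicted deviation against the per-colour threshold is what flags which predictor values need adjusting.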
-
Survival Analysis
-
Analyzed the data generated by machines to predict the time to the next event for each machine. Data captured during both failure and normal conditions was used to build a survival model that accounts for censoring and predicts the time remaining until the next failure. These results were used to plan maintenance activities for the respective machines. In addition, the survival metrics and performance of different machine models were analyzed. Tools and Technologies: Python, R, Survival Analysis (Kaplan-Meier, CoxPH Regression), Censoring, Scikit-Learn, Survival, SQL Developer
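The Kaplan-Meier estimator with right-censoring can be sketched directly in NumPy; the durations and event flags below are made-up machine data, not the project's:

```python
import numpy as np

durations = np.array([5, 8, 8, 12, 15, 20, 20, 24])  # hours to event/censor
events    = np.array([1, 1, 0, 1, 0, 1, 1, 0])       # 1=failure, 0=censored

def kaplan_meier(t, e):
    """Product-limit estimator: list of (event_time, survival_prob)."""
    order = np.argsort(t)
    t, e = t[order], e[order]
    surv, s = [], 1.0
    for ti in np.unique(t[e == 1]):          # only observed failure times
        at_risk = np.sum(t >= ti)            # machines still under observation
        d = np.sum((t == ti) & (e == 1))     # failures at this time
        s *= 1.0 - d / at_risk               # multiply conditional survival
        surv.append((ti, s))
    return surv

for ti, s in kaplan_meier(durations, events):
    print(ti, round(s, 3))
```

Censored machines count toward the at-risk set until their censoring time but never as failures, which is what distinguishes this from a naive failure-time average.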
Honors & Awards
-
Exceptional Performance Award
Boeing
Delivering key machine learning in the areas of BMS Smart Search, AOS Analytics, AIops & GDA - Synthetic Data Generation - Cash Award of $1000.00
-
Speaker - deeplearning.ai
deeplearning.ai
Honored to be selected as speaker/panelist for deeplearning.ai meetup organized in India.
-
Boeing Exceptional Performance Award
Boeing
For exceptional performance in building Future State Architecture Tool - Named Entity Recognition (Deep Learning) and ChatBot
-
Boeing Exceptional Performance Award
Boeing
Recognised for exceptional performance in providing Deep Learning solutions for Boeing's Line Move project
-
Live Wire - Award for Exceptional Performance
-
Award for Exceptional Performance - Building Predictive Models for Konica Minolta
-
Spot Award
-
Award for exceptional performance in building a high end predictive analytics solution and assisting the deployment
-
Crown - Bravo Award
-
Award for exceptional Performance
Languages
-
Malayalam
Native or bilingual proficiency
-
English
Native or bilingual proficiency
Recommendations received
-
LinkedIn User
7 people have recommended Binil
More activity by Binil
-
I've been at Microsoft for just over a month. It's been an exhilarating journey getting to know a huge number of people, organizations and projects…
Liked by Binil Kuriachan
-
I’m often asked what problem I’d solve if I were to start another company. I probably won’t do a startup any time soon (because startups are hard)…
Liked by Binil Kuriachan