Binil Kuriachan

Bengaluru, Karnataka, India
11K followers · 500+ connections

About

Binil Kuriachan is a Sr. Applied Scientist at Microsoft (R&D) and part of Cybersecurity…

Experience

  • Microsoft

    Bengaluru, Karnataka, India

Education

  • Indian Institute of Technology, Madras

    Activities and Societies: Text Mining, Machine Learning, Data Preprocessing, Algorithms, Natural Language Processing

    Worked on a research problem in Text Mining and Personalization using advanced analytics techniques. My work was presented at an international conference on Design Science (DESRIST), and the research paper is published in a Springer journal.
    * Candidate for MS Research *

  • Arizona State University

    Activities and Societies: Ira A. Fulton Schools of Engineering at Arizona State University

    Master's in Computer Science - Machine Learning Specialization.
    Algorithms, Data Structures, Deep Learning, Machine Learning, Statistical Modeling, Data Mining

    Completed with a GPA of 4.0/4.0

  • Activities and Societies: IEEE, Programming, Code Debugging, Microprocessor Programming

    Algorithms, Data Structures, Programming, Operating Systems, Database Management Systems, Compilers.

    * Received Best Project Awards
    * Won First Prize in a Design Competition conducted by TCS
    * Won prizes in programming competitions

  • Activities and Societies: Data Science, Deep Learning, Machine Learning, Artificial Intelligence, NLP, Statistical Modelling

    Advanced specialisations in Data Science, Deep Learning, Statistical Modelling, Natural Language Processing, and Programming (Python and R). These courses offer a deep understanding of both the theoretical and practical aspects of each area, supplemented with well-structured assignments and capstone projects. The individual courses are offered by reputed universities such as Stanford, the University of Washington, the University of Michigan, and Johns Hopkins.

Publications

  • DeepCC: Multimodal Crop Classification Using Satellite Data and Crop Rotation Patterns

    Microsoft Journal for Applied Research

    Identification of crop types early in the growing season aids in many pre-harvest decisions, including monitoring food security, directing the best use of the landscape, supporting agricultural policy, enhancing supply chain efficiency through crop yield forecasting, and estimating crop water needs for irrigation planning. Therefore, there would be substantial advantages if a model could accurately predict crop type early or mid-season. In this study, using temporal patterns from multitemporal satellite images and crop rotation patterns from the previous year, we developed a robust deep learning model that can classify various crop types. This model can predict the crop type at the pixel level for a given boundary with latitude/longitude coordinates. CDL data is used in our study to label training samples on satellite images for crop type classification and to track crop rotation patterns from previous years. The greenness and density of the vegetation visible in satellite images are measured using the Normalized Difference Vegetation Index (NDVI), which exhibits different temporal patterns for different crops during a growing season. We have selected a sample of pixel-level NDVI for each crop type over a specified period to train a deep learning network to recognize temporal signals. Crop rotation is used to promote soil health and increase the amount of nutrients available for crops since monocropping, the practice of growing the same crop in the same location for an extended period, gradually depletes the soil quality. To give the model more intelligence, crop rotation signals from prior years are incorporated. The LSTM (Long Short-Term Memory) deep learning model is developed utilizing temporal inputs to attain an overall accuracy of 85%. The accuracy is observed to be improved by using crop rotation signals as a multi-headed ensemble model. Our suggested method can be applied to a variety of scenarios, including early, mid, and late season predictions.
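
    A minimal sketch of the multi-headed idea described above, not the authors' implementation: one LSTM head reads a pixel's NDVI time series, a dense head reads a one-hot crop-rotation history, and the two are ensembled into a softmax classifier. All sizes, layer widths, and the synthetic data are assumptions.

    ```python
    import numpy as np
    from tensorflow.keras import layers, Model

    NUM_CROPS = 10   # assumed number of crop classes
    SEQ_LEN = 24     # assumed NDVI observations per growing season
    ROT_YEARS = 3    # assumed years of crop-rotation history

    # NDVI head: the temporal signal for one pixel.
    ndvi_in = layers.Input(shape=(SEQ_LEN, 1), name="ndvi_series")
    x = layers.LSTM(64)(ndvi_in)

    # Rotation head: previous years' crop labels, one-hot per year.
    rot_in = layers.Input(shape=(ROT_YEARS * NUM_CROPS,), name="rotation_history")
    r = layers.Dense(32, activation="relu")(rot_in)

    # Ensemble the two heads and classify.
    merged = layers.concatenate([x, r])
    out = layers.Dense(NUM_CROPS, activation="softmax")(merged)

    model = Model(inputs=[ndvi_in, rot_in], outputs=out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Synthetic stand-in data; real training would use CDL-labelled pixels.
    ndvi = np.random.rand(256, SEQ_LEN, 1).astype("float32")
    rot = np.random.randint(0, 2, (256, ROT_YEARS * NUM_CROPS)).astype("float32")
    labels = np.random.randint(0, NUM_CROPS, 256)
    model.fit([ndvi, rot], labels, epochs=1, batch_size=32)
    ```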

  • AI Enabled Context Sensitive Information Retrieval System

    Springer International Publishing

    We are developing Smart Search, which combines the capabilities of keyword search (with a reduced need for indexing) with Artificial Intelligence that helps in understanding the semantics of a user query and finding the most relevant information. We developed an indigenous Named Entity Recognition model and custom functions to identify entities in the search text. The keyword-search module of Smart Search pre-processes the document corpus and transforms it into distributable matrices comprising numerical scores for each word present in each document. The matrix can be distributed for efficient computation, and an index is maintained that maps each word to its column position. Numerical scores can be calculated using many methods, such as TF-IDF, count vectorizer, binary vectorizer, etc. When a search query arrives, it is likewise transformed into a vector of numerical scores using the same index. The relevant documents can be identified by taking the dot product of the input vector with the respective vectors from the corpus matrix. In this work, various Machine Learning architectures are also experimented with on the document corpus to calculate a word embedding for each word based on its context in the documents. A word embedding represents a word as a vector of numbers, where each value can be seen as a representation of a different feature that explains the word. Embeddings such as Word2Vec and GloVe are used, and weights are updated from the corpus. The document corpus is also used to learn the proportions of different topics using the Latent Dirichlet Allocation algorithm, one of the popular unsupervised learning methods for identifying latent topics in text documents. The calculated word embeddings and LDA are used to understand the semantic similarity between the user query and documents. In order to limit the search space, classification models are trained from the dataset. Smart Search encompasses the best capabilities of learning from the data as well as the keyword-search approach.
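
    A hedged sketch of the keyword-search core described above: documents become a TF-IDF matrix, the query is vectorized against the same index, and relevance is scored by the normalized dot product. The corpus here is illustrative only.

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = [
        "aircraft assembly line part shortage report",
        "supplier delivery schedule for purchase orders",
        "machine learning model for text search",
    ]
    vectorizer = TfidfVectorizer()
    corpus_matrix = vectorizer.fit_transform(docs)   # rows: docs, cols: vocabulary index

    query = "purchase order delivery delay"
    query_vec = vectorizer.transform([query])        # same index as the corpus matrix

    # Rank documents by similarity between query vector and document vectors.
    scores = cosine_similarity(query_vec, corpus_matrix).ravel()
    for i in scores.argsort()[::-1]:
        print(f"{scores[i]:.3f}  {docs[i]}")
    ```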

  • Assurance of Supply (AoS) Analytics: Predictive Models on Purchase Order Shipping and Delivery Schedules

    Boeing Technical Conference

    Part shortages have long been a stumbling block that hinders the smooth functioning of Boeing assembly lines. The Assurance of Supply Analytics Platform (AoSAP) is an application being developed in this regard to help users identify and address the drivers of part shortages across the integrated value chain, from program start-up through engineering requirements release, purchase order placement, supplier delivery, and manufacturing build. It has been observed that, among the drivers of part shortages, supplier-related delays (late deliveries and defective parts) alone contribute a significant amount. As part of the Assurance of Supply (AoS) objective, we address some of these challenges by using predictive analytics that can help reduce supplier-caused late deliveries for the 737 line. Delays can happen due to late shipping as well as issues with transportation, which result in on-dock deliveries deviating from the scheduled dates. The focus areas of our work are: 1. a comprehensive understanding of the business process, which enabled us to create a list of parameters correlated to supplier performance, shipping, and on-dock delays; 2. building machine learning models using historical discrete purchase order data to find patterns and predict possible shipping and on-dock delays for purchase orders planned for future dates. On-time delivery of planned purchase orders is crucial to maintaining uninterrupted assembly line operations. Knowing the possible delays in advance helps the operations team plan accordingly and take actions to mitigate the situation, thereby reducing the chances of it resulting in a part shortage.
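
    Purely as an illustration of the second focus area (the real AoSAP features and data are proprietary and not shown here), a classifier over invented purchase-order features predicting late delivery might look like this:

    ```python
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report

    rng = np.random.default_rng(0)
    n = 1000
    df = pd.DataFrame({
        "supplier_on_time_rate": rng.uniform(0.5, 1.0, n),   # assumed feature
        "lead_time_days": rng.integers(5, 120, n),           # assumed feature
        "order_quantity": rng.integers(1, 500, n),           # assumed feature
    })
    # Synthetic label: late if supplier history is poor and lead time is short.
    late = ((df["supplier_on_time_rate"] < 0.7) & (df["lead_time_days"] < 30)).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(df, late, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
    print(classification_report(y_test, clf.predict(X_test)))
    ```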

  • ALDA: An Aggregated LDA for Polarity Enhanced Aspect Identification Technique in Mobile App Domain

    Springer International Publishing

    With the increased popularity of smart mobile devices, mobile applications (a.k.a. apps) have become essential. While app developers face an extensive challenge in improving user satisfaction by exploiting valuable feedback, app users are overloaded with far too many apps. Extracting the valuable features of apps and mining the associated sentiments is of utmost importance for app developers. Similarly, from the user perspective, the key preferences should be identified. This work deals with profiling users and apps using a novel LDA-based aspect identification technique. A polarity aggregation technique is used to tag the weak features of the apps that developers should concentrate on. The proposed technique has been evaluated on an Android review dataset to validate its efficacy compared to state-of-the-art algorithms. Experimental findings suggest the superiority and applicability of our model in practical scenarios.
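
    A rough stand-in for the pipeline described above, not the ALDA algorithm itself: LDA extracts latent aspects from reviews, and per-review polarity (TextBlob here, as an assumption) is aggregated into each topic to flag weak aspects.

    ```python
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation
    from textblob import TextBlob

    reviews = [
        "battery drains too fast after the update",
        "love the new dark mode interface",
        "app crashes whenever I open the camera",
        "the interface is clean and easy to use",
    ]
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(reviews)

    lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)
    doc_topics = lda.transform(X)                 # per-review topic proportions

    # Aggregate review polarity into each topic, weighted by topic share.
    topic_polarity = [0.0, 0.0]
    for review, weights in zip(reviews, doc_topics):
        polarity = TextBlob(review).sentiment.polarity
        for t, w in enumerate(weights):
            topic_polarity[t] += w * polarity

    # Topics with the lowest aggregate polarity flag the "weak" aspects.
    print(topic_polarity)
    ```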

  • Comment Prediction in Facebook Pages using Regression Techniques

    IJRASET - International Journal for Research in Applied Science and Engineering Technology

    Data in social networks is increasing day by day, and it requires highly capable services to manage the large amount of data flowing towards it. This work studies user activity patterns in social networks, concentrating on an active social network, Facebook, and especially Facebook Pages. Here, user comment volume prediction is made based on page category, i.e., a post from a particular category of page will get a certain number of comments. The goal is to predict the comment volume for each page and to find which page category gets the most comments. Preliminary work concluded with a decision tree, so in this further study we analyzed some more regression techniques to make the prediction effective. In this work, the user comment pattern is modelled with respect to page likes and popularity, page category, and time. Decision Tree, LASSO, K-Nearest Neighbors (KNN), Random Forest, and Linear Regression techniques are used, and the error is measured with the Root Mean Square Error (RMSE) metric. We conclude that the K-Nearest Neighbors algorithm performs well and gives effective predictions.
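
    A compact sketch of the comparison described above: the listed regressors are fit to the same features and scored by RMSE. The data is synthetic; the study used Facebook page/post features such as likes, category, and time.

    ```python
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.linear_model import Lasso, LinearRegression
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X = rng.random((500, 4))                                   # stand-ins for page features
    y = 100 * X[:, 0] + 20 * X[:, 1] + rng.normal(0, 5, 500)   # comment volume

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    models = {
        "Decision Tree": DecisionTreeRegressor(random_state=0),
        "LASSO": Lasso(),
        "KNN": KNeighborsRegressor(),
        "Random Forest": RandomForestRegressor(random_state=0),
        "Linear Regression": LinearRegression(),
    }
    for name, m in models.items():
        pred = m.fit(X_tr, y_tr).predict(X_te)
        rmse = np.sqrt(mean_squared_error(y_te, pred))
        print(f"{name}: RMSE = {rmse:.2f}")
    ```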

  • Chronic Kidney Disease Analysis using Machine Learning Algorithms

    IJRASET - International Journal for Research in Applied Science and Engineering Technology

    Nowadays, Data Mining technology has become the trend for diagnostics. Innumerable efforts have been put in to cope with the explosion of medical data, retrieving useful information from it and making predictions with the available information. The main objective of this research paper is predicting the presence of Chronic Kidney Disease (CKD) in patients by using various classification algorithms such as Naïve Bayes, Random Forest, Logistic Regression, Linear Discriminant Analysis (LDA), and Quadratic Discriminant Analysis (QDA), and also finding the features most predictive of and influential on the target variable by using a feature selection technique (the Chi-Square test) and the Gini impurity index. The R tool is used for implementing these classification techniques. The performance of all the stated algorithms is compared on accuracy, precision, and recall in order to determine the best classifier for predicting which patients have chronic kidney disease.
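
    The paper used R; the equivalent Python sketch below runs chi-square feature selection and then compares the listed classifiers, all on synthetic stand-in data.

    ```python
    import numpy as np
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                               QuadraticDiscriminantAnalysis)

    rng = np.random.default_rng(0)
    X = rng.random((400, 20))                      # stand-in clinical features
    y = (X[:, 0] + X[:, 3] > 1.0).astype(int)      # stand-in CKD label

    X_sel = SelectKBest(chi2, k=8).fit_transform(X, y)   # chi2 requires non-negative X
    X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=0)

    for clf in [GaussianNB(), RandomForestClassifier(random_state=0),
                LogisticRegression(), LinearDiscriminantAnalysis(),
                QuadraticDiscriminantAnalysis()]:
        acc = clf.fit(X_tr, y_tr).score(X_te, y_te)
        print(f"{type(clf).__name__}: accuracy = {acc:.3f}")
    ```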


Courses

  • Applied Machine Learning

    University of Michigan

  • Applied Text Mining

    University of Michigan

  • CSE 511 - Data Processing at Scale

    CSE 511

  • CSE 551 - Foundations of Algorithms

    CSE 551

  • CSE 565 - Software Verification Validation & Test

    CSE 565

  • CSE 571 - Artificial Intelligence

    CSE 571

  • CSE 572 - Data Mining

    CSE 572

  • CSE 575 - Statistical Machine Learning

    CSE 575

  • CSE 598 - Deep Learning for Computer Vision

    CSE 598

  • Convolutional Neural Networks

    -

  • Data Science using Python

    University of Michigan

  • Getting and Cleaning Data

    Johns Hopkins University

  • Improving Deep Neural Networks

    deeplearning.ai

  • Machine Learning Specialization

    University of Washington

  • Machine Learning: Classification

    University of Washington

  • Machine Learning: Regression

    University of Washington

  • Neural Networks and Deep Learning

    SNFRHANPCF97

  • Sequence Models (RNN)

    JRHUZN8ZXJGD

  • Statistical Inference

    Johns Hopkins University

  • Statistical Learning

    Stanford University

Projects

  • Custom Deep Models for Phishing Mail Detection focusing on Business Email Compromise

    Business Email Compromise (BEC) is one of the major phishing mail categories and poses a major threat to Enterprise customers. In 2022, BEC led to roughly $3 billion in losses in the US alone. I have been working on building an efficient solution to detect and prevent BEC attacks originating from email for Enterprise customers.

    Focus areas include:
    * Traffic selection
    * Data pipeline for training
    * ML system design
    * Design and Build Scalable solution
    * Deployment
    * Real-time Inference

  • Microsoft Security Copilot: Web Grader

    -

    Built and deployed an LLM-based grader for Web as part of Microsoft Security Copilot.

  • Web Defense & Anti Phishing Module – Microsoft Defender for Office

    -

    Building AI solutions for Cybersecurity in the M365 product suite:
    * Phishing Mail Identification & Generation (Attack Simulation Module)
    * Predicted Compromise Rate, Phish Susceptibility Score (user level)
    * Web Defense System for Web Protection

  • Crop Classification using multi-spectral Geo-Spatial Data (Satellite/Weather)

    -

    Crop Classification using multi-spectral Geo-Spatial Data (Satellite/Weather) for Azure FarmBeats platform. Identification of crop types early in the growing season aids in many pre-harvest decisions, including monitoring food security, directing the best use of the landscape, supporting agricultural policy, enhancing supply chain efficiency through crop yield forecasting, and estimating crop water needs for irrigation planning. Therefore, there would be substantial advantages if a model could accurately predict crop type early or mid-season. We built a robust Deep Learning Model that can classify the crop types by using temporal patterns emerging from multitemporal satellite images and past year's crop rotation patterns. This model can predict the crop type at the pixel level for a given boundary with latitude/longitude coordinates. CDL data is used in our study to label training samples on satellite images for crop type classification and to track crop rotation patterns from previous years. The greenness and density of the vegetation visible in satellite images are measured using the Normalized Difference Vegetation Index (NDVI), which exhibits different temporal patterns for different crops during a growing season. Crop rotation is used to promote soil health and increase the amount of nutrients available for crops since monocropping, the practice of growing the same crop in the same location for an extended period, gradually depletes the soil quality. To give the model more intelligence, crop rotation signals from prior years are incorporated. A sequence-to-sequence deep learning model is developed utilizing crop rotation signals and temporal inputs to attain an overall accuracy of 90%. The accuracy can be improved by using more temporal data over a growing season. Due to its multimodality, our suggested method can be applied to a variety of scenarios, including early, mid, and late season predictions.

  • Azure FarmBeats: NDVI forecast using satellite imagery and weather data

    -

    Azure FarmBeats is a business-to-business offering available in Azure Marketplace. It enables aggregation of agriculture data sets across providers. Azure FarmBeats enables you to build artificial intelligence (AI) or machine learning (ML) models based on fused data sets. By using Azure FarmBeats, agriculture businesses can focus on core value-adds instead of the undifferentiated heavy lifting of data engineering. As part of FarmBeats, we built an NDVI forecast model using 60 days of historical…

    Azure FarmBeats is a business-to-business offering available in Azure Marketplace. It enables aggregation of agriculture data sets across providers and lets you build artificial intelligence (AI) or machine learning (ML) models on fused data sets. By using Azure FarmBeats, agriculture businesses can focus on core value-adds instead of the undifferentiated heavy lifting of data engineering. As part of FarmBeats, we built an NDVI forecast model using 60 days of historical satellite imagery, 30 days of historical weather data, and 10 days of forecasted weather data. This project demonstrates the different capabilities of Azure FarmBeats and how it can be used effectively for building end-to-end machine learning models for the agriculture and food industry.
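
    A hedged sketch of this forecasting setup: 60 days of NDVI history, 30 days of past weather, and 10 days of forecast weather are flattened into one feature vector for a regressor. The arrays, the target construction, and the model choice are all stand-in assumptions.

    ```python
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    rng = np.random.default_rng(0)
    n = 300
    ndvi_hist = rng.random((n, 60))       # 60 days of historical NDVI per field
    weather_hist = rng.random((n, 30))    # 30 days of historical weather signal
    weather_fcst = rng.random((n, 10))    # 10 days of forecasted weather
    X = np.hstack([ndvi_hist, weather_hist, weather_fcst])
    y = ndvi_hist[:, -1] + 0.1 * weather_fcst.mean(axis=1)  # stand-in NDVI target

    model = GradientBoostingRegressor().fit(X[:250], y[:250])
    print("held-out R^2:", model.score(X[250:], y[250:]))
    ```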

  • Data Imputation in geospatial time series data with advanced ML techniques

    -

    Data Imputation in geospatial time series data with deep learning techniques.

  • Smart Search – AI Enabled Scalable and Efficient Context Sensitive Information Retrieval System

    -

    We are developing Smart Search, which combines the capabilities of keyword search (with a reduced need for indexing) with Artificial Intelligence that helps in understanding the semantics of a user query and finding the most relevant information. We developed an indigenous Named Entity Recognition model and custom functions to identify entities in the search text. The keyword-search module of Smart Search pre-processes the document corpus and transforms it into distributable matrices comprising numerical scores for each word present in each document. The matrix can be distributed for efficient computation, and an index is maintained that maps each word to its column position. Numerical scores can be calculated using many methods, such as TF-IDF, count vectorizer, binary vectorizer, etc. When a search query arrives, it is likewise transformed into a vector of numerical scores using the same index. The relevant documents can be identified by taking the dot product of the input vector with the respective vectors from the corpus matrix. In this work, various Machine Learning architectures are also experimented with on the document corpus to calculate a word embedding for each word based on its context in the documents. A word embedding represents a word as a vector of numbers, where each value can be seen as a representation of a different feature that explains the word. Embeddings such as Word2Vec and GloVe are used, and weights are updated from the corpus. The document corpus is also used to learn the proportions of different topics using the Latent Dirichlet Allocation algorithm, one of the popular unsupervised learning methods for identifying latent topics in text documents. The calculated word embeddings and LDA are used to understand the semantic similarity between the user query and documents. In order to limit the search space, classification models are trained from the dataset. Smart Search encompasses the best capabilities of learning from the data as well as the keyword-search approach.

  • Assurance of Supply (AoS) Analytics

    -

    The AoS Analytics Platform is developed to help users identify and address the drivers of part shortages across the integrated value chain, from program start-up through engineering requirements release, purchase order placement, supplier delivery, and manufacturing build. This project focuses on building insights and using predictive analytics to help reduce part shortages and enable efficient planning.

  • Future State Architecture Tools -Deep Learning based Named Entity Recognition Model for detecting and categorizing Keywords

    -

    FSAT is an architecture tool which helps users by suggesting the possible future architecture of an application based on its current architecture. In this project, we created a Deep Learning based NER model with a Bidirectional LSTM (Long Short-Term Memory) network backed with Convolutional Neural Networks (character, word, and case embeddings) for detecting technical keywords in Boeing proprietary technical documents and associating them with the respective categories. Model performance was evaluated with the F1 score.
    Tools and Technology: TensorFlow, Python, NLP, CNN, LSTM
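
    A simplified sketch of such a tagger in Keras, with word embeddings only (the character-level CNN and case-embedding branches are omitted for brevity); vocabulary size, tag set, and data are assumptions.

    ```python
    import numpy as np
    from tensorflow.keras import layers, Model

    VOCAB = 5000    # assumed word-vocabulary size
    TAGS = 5        # assumed number of entity categories (BIO-style)
    MAX_LEN = 50    # assumed sentence length after padding

    words_in = layers.Input(shape=(MAX_LEN,), name="word_ids")
    emb = layers.Embedding(VOCAB, 100, mask_zero=True)(words_in)
    bilstm = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(emb)
    tags_out = layers.TimeDistributed(layers.Dense(TAGS, activation="softmax"))(bilstm)

    model = Model(words_in, tags_out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Synthetic stand-in batch: token ids in, per-token tag ids out.
    X = np.random.randint(1, VOCAB, (64, MAX_LEN))
    y = np.random.randint(0, TAGS, (64, MAX_LEN))
    model.fit(X, y, epochs=1, batch_size=16)
    ```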

  • Flight Assembly Line Move - Deep Neural Network Models for Object Detection and Sequence Modeling

    -

    The Line Move project focuses on identifying the different equipment (using camera feeds) used during flight assembly and predicting the likely sequence, to better organize the work and understand anomalies.

    In this project, we built two separate models (Deep Learning - CNN and RNN) for Object Detection and Sequence Modeling. For the object detection task of identifying the various equipment used in the factory line move of flight assembly, we leveraged a few state-of-the-art architectures such as YOLO, Fast-RCNN, and Mask-RCNN, and trained them on custom images. After evaluating performance, the model created with YOLO was selected for deployment. For the sequence-modeling task of identifying the equipment sequence, we created a stacked LSTM network with 200 memory cells and variable-length inputs to predict the upcoming equipment requirements.
    Technology Used: Python, TensorFlow, Keras, Convolutional Neural Networks, Recurrent Neural Networks, Sequence-to-Sequence Models, LSTM, etc.
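
    A sketch of the sequence-modeling half: a stacked LSTM with 200 memory cells predicting the next equipment id from a window of preceding ones. The fixed window, equipment vocabulary, and data are assumptions (the original used variable-length inputs).

    ```python
    import numpy as np
    from tensorflow.keras import layers, Model

    NUM_EQUIP = 20   # assumed number of distinct equipment types
    SEQ_LEN = 10     # window of preceding equipment events

    inp = layers.Input(shape=(SEQ_LEN,))
    x = layers.Embedding(NUM_EQUIP, 32)(inp)
    x = layers.LSTM(200, return_sequences=True)(x)   # stacked LSTM, 200 memory cells
    x = layers.LSTM(200)(x)
    out = layers.Dense(NUM_EQUIP, activation="softmax")(x)

    model = Model(inp, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Synthetic event windows and next-equipment labels.
    X = np.random.randint(0, NUM_EQUIP, (128, SEQ_LEN))
    y = np.random.randint(0, NUM_EQUIP, 128)
    model.fit(X, y, epochs=1, batch_size=32)
    ```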

  • Deep Neural based Encoder-Decoder model for Multi-Level Classification Problem

    -

    Created a deep neural Encoder-Decoder sequence model for handling multilevel classification in a single shot. In this project, the input text consisted of customer-facing issues which usually end up in multiple queues before getting the final fix. Our aim was to identify the sequence of queues without any manual intervention. We solved it using a sequence model with an Encoder-Decoder architecture, since the input and output records have variable lengths.

    Tools and Technologies Used: Python, TensorFlow, Deep Learning, Sequence Models, Encoder-Decoder architecture, LSTM, GRU, etc.
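
    A compact, hedged sketch of such an encoder-decoder in Keras: the encoder reads issue-token ids, its state seeds a decoder that emits queue ids step by step under teacher forcing. All sizes and the synthetic data are assumptions.

    ```python
    import numpy as np
    from tensorflow.keras import layers, Model

    VOCAB, QUEUES, IN_LEN, OUT_LEN = 2000, 12, 40, 5

    enc_in = layers.Input(shape=(IN_LEN,))
    enc_emb = layers.Embedding(VOCAB, 64)(enc_in)
    _, h, c = layers.LSTM(128, return_state=True)(enc_emb)   # keep encoder state

    dec_in = layers.Input(shape=(OUT_LEN,))                  # teacher-forced queue ids
    dec_emb = layers.Embedding(QUEUES, 32)(dec_in)
    dec_out = layers.LSTM(128, return_sequences=True)(dec_emb, initial_state=[h, c])
    queue_probs = layers.TimeDistributed(layers.Dense(QUEUES, activation="softmax"))(dec_out)

    model = Model([enc_in, dec_in], queue_probs)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Synthetic stand-ins: issue-token ids in, queue-id sequences out.
    X = np.random.randint(1, VOCAB, (64, IN_LEN))
    Y_in = np.random.randint(0, QUEUES, (64, OUT_LEN))       # shifted targets in practice
    Y_out = np.random.randint(0, QUEUES, (64, OUT_LEN))
    model.fit([X, Y_in], Y_out, epochs=1, batch_size=16)
    ```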

  • Incident Management Auto-response Prediction

    -

    In this work, we built a multi-phase Machine Learning model which predicts the content category of an email and triggers the appropriate response template to the user. We built a first-level multinomial classification model (SVM, ensemble models, Deep Learning, etc.) which predicts the high-level groups, and subsequently employed a topic model using Latent Dirichlet Allocation (LDA) to determine the topic distribution. Tools and Technology: NLTK, Scikit-Learn, Python, Bag-of-Words, Vectorization, Data Mining, Nearest Neighbors, Document Similarity, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Word Embedding, Semantic Analysis, Deep Learning.

  • Best Resolution Prediction

    -

    In this project, we built a Knowledge Base resolution prediction system using vectorization along with semantic document similarity on previous ‘Incident Records’ as well as ‘Knowledge Articles’. To improve prediction accuracy, incident and knowledge records are filtered by configuration item using supervised Machine Learning classification models (Deep Learning, CNN, Word Embedding, SVM, Ensemble Models, Logistic Regression, Naive Bayes). Tools and Technology: NLTK, Scikit-Learn, Python, Bag-of-Words, Vectorization, Data Mining, Nearest Neighbors, Document Similarity, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Word Embedding, Semantic Analysis, Deep Learning, Keras, TensorFlow.

  • ALDA: An Aggregated LDA for Polarity Enhanced Aspect Identification Technique in Mobile App Domain

    -

    With the increased popularity of smart mobile devices, mobile applications (a.k.a. apps) have become essential. While app developers face an extensive challenge in improving user satisfaction by exploiting valuable feedback, app users are overloaded with far too many apps. Extracting the valuable features of apps and mining the associated sentiments is of utmost importance for app developers. Similarly, from the user perspective, the key preferences should be identified. This work deals with profiling users and apps using a novel LDA-based aspect identification technique. A polarity aggregation technique is used to tag the weak features of the apps that developers should concentrate on. The proposed technique has been evaluated on an Android review dataset to validate its efficacy compared to state-of-the-art algorithms. Experimental findings suggest the superiority and applicability of our model in practical scenarios.

    Tools and Technologies: Latent Dirichlet Allocation (LDA), Semantic Analysis, Python, R, Natural Language Processing, NLTK, Scikit-Learn, Tableau, Machine Learning

  • Aspect Level Sentiment Analysis from Free Form Text

    -

    This project identifies the sentiment distribution across multiple topics in free-form text (feedback, surveys, collaboration tools, chat data, etc.). We built LDA-based topic modeling combined with dictionary-based as well as ML-based sentiment analysis to aggregate the polarity across each topic. The results are used for management decisions and for fine-tuning the respective work areas. For aspect identification, we developed a Latent Dirichlet Allocation (LDA) model; for ML-based sentiment analysis, models using SVM, CNN + RNN (LSTM), Logistic Regression with L2 regularization, Naïve Bayes, and Random Forest were evaluated. TextBlob is used for dictionary-based sentiment analysis.
    Tools and Technology: NLTK, Scikit-Learn, Keras, TensorFlow, Python, Bag-of-Words, Word Embedding, Word2Vec, Vectorization, Machine Learning, Support Vector Machines, Logistic Regression with Regularization, Random Forest, Semantic Analysis, Deep Learning, Gensim.

  • Predicting the Deviation color Metric for Large Scale Printers

    -

    Given data from machines, we created a regression model to predict the range and deviation of the outcome variable, one of the color metrics, given values for all the other predictors. For each color, a threshold value for deviation was given. The predicted deviation was used to identify the possible deviation that could occur in a particular situation, and this information was used to adjust the values of the significant predictors. Tools and Technologies: Python, R, Regression Models (Multivariate, Ridge, Lasso, KNN, Tree Models), Classification Models (Logistic Regression, Random Forest), Deep Neural Networks, Scikit-Learn, SQL Developer

  • Survival Analysis

    -

    Analyzed the data generated by machines and predicted the time to the next failure event. Data from both failure and normal operating conditions was taken, and a survival model was created that also accounts for censoring, to predict the time remaining until the next failure. These results were used to plan the maintenance activities of the respective machines. Along with this, the survival metrics and the performance of different machine models were analyzed. Tools and Technologies: Python, R, Survival Analysis (Kaplan-Meier, CoxPH Regression), Censoring, Scikit-Learn, Survival, SQL Developer
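
    A hedged sketch of this setup with the lifelines library: a Kaplan-Meier fit for the survival curve plus a Cox proportional-hazards model over assumed machine covariates, with right-censoring handled via an event indicator. The data and column names are illustrative.

    ```python
    import numpy as np
    import pandas as pd
    from lifelines import KaplanMeierFitter, CoxPHFitter

    rng = np.random.default_rng(0)
    n = 200
    df = pd.DataFrame({
        "duration": rng.exponential(100, n),     # time to failure (hours)
        "event": rng.integers(0, 2, n),          # 1 = failure observed, 0 = censored
        "load": rng.random(n),                   # assumed machine covariate
        "age_years": rng.integers(1, 15, n),     # assumed machine covariate
    })

    kmf = KaplanMeierFitter().fit(df["duration"], event_observed=df["event"])
    print(kmf.median_survival_time_)             # typical time-to-failure

    cph = CoxPHFitter().fit(df, duration_col="duration", event_col="event")
    cph.print_summary()                          # covariate hazard ratios
    ```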

Honors & Awards

  • Exceptional Performance Award

    Boeing

    Delivering key machine learning solutions in the areas of BMS Smart Search, AoS Analytics, AIOps, and GDA Synthetic Data Generation. Cash award of $1,000.

  • Speaker - deeplearning.ai

    deeplearning.ai

    Honored to be selected as a speaker/panelist for a deeplearning.ai meetup organized in India.

  • Boeing Exceptional Performance Award

    Boeing

    For exceptional performance in building Future State Architecture Tool - Named Entity Recognition (Deep Learning) and ChatBot

  • Boeing Exceptional Performance Award

    Boeing

    Recognised for exceptional performance in providing Deep Learning solutions for Boeing's Line Move project

  • Live Wire - Award for Exceptional Performance

    -

    Award for Exceptional Performance - Building Predictive Models for Konica Minolta

  • Spot Award

    -

    Award for exceptional performance in building a high-end predictive analytics solution and assisting with its deployment

  • Crown - Bravo Award

    -

    Award for exceptional performance

Languages

  • Malayalam

    Native or bilingual proficiency

  • English

    Native or bilingual proficiency

Recommendations received

7 people have recommended Binil
