Kalyan KS

Tirupati Urban, Andhra Pradesh, India

28K followers 500+ connections

Articles by Kalyan

Top LLM Papers of the Week (August Week 4, 2024)

By Kalyan KS

Sep 1, 2024
Top LLM Papers of the Week (August Week 3, 2024)

By Kalyan KS

Aug 28, 2024
Top RAG Papers of the Week (August Week 3, 2024)

By Kalyan KS

Aug 26, 2024

Activity

A well-researched article by Lovely Majumdar on NIRF and Scientific Misconduct in India. It includes analysis by IRW as well as quotes from our…

Liked by Kalyan KS
Serve Llama 8B with OpenAI API format just with few lines of code using LitServe 🚀 Learn more 👉🏻 https://lnkd.in/eF7aAVf7

Liked by Kalyan KS
hi all Today was my last day at Cerence Inc. . I am grateful for all the learning experience here and wonderful people I met. I am open to new…

Liked by Kalyan KS

Publications

AMMU: A survey of transformer-based biomedical pretrained language models

Journal of Biomedical Informatics December 31, 2021

Provides a comprehensive survey of transformer-based biomedical pretrained language models.

BertMCN: Mapping colloquial phrases to standard medical concepts using BERT and highway network

Artificial Intelligence in Medicine January 31, 2021

In the last few years, people have started sharing a great deal of health-related information in the form of tweets, reviews, and blog posts. These user-generated clinical texts can be mined for useful insights. However, automatic analysis of clinical text requires identifying standard medical concepts. Most existing deep-learning-based medical concept normalization systems are based on CNNs or RNNs. The performance of these models is limited because they have to be trained from scratch (except for the embeddings). In this work, we propose a medical concept normalization system based on BERT and a highway layer. BERT, a pre-trained, context-sensitive deep language representation model, has advanced state-of-the-art performance in many NLP tasks, and the gating mechanism in the highway layer helps the model select only the important information. Experimental results show that our model outperforms all existing methods on two standard datasets. Further, we conduct a series of experiments to study the impact of different learning rates and batch sizes, noise, and freezing encoder layers on our model.

Social Media Medical Concept Normalization using RoBERTa in Ontology Enriched Text Similarity Framework

KNLP Workshop @AACL-IJCNLP 2020 December 6, 2020

Pattisapu et al. (2020) formulate medical concept normalization (MCN) as a text similarity problem and propose a model based on RoBERTa and graph-embedding-based target concept vectors. However, graph embedding techniques ignore valuable information available in the clinical ontology, such as concept descriptions and synonyms. In this work, we enhance the model of Pattisapu et al. (2020) with two novel changes. First, we use retrofitted target concept vectors instead of graph-embedding-based vectors. This is the first work to leverage both concept descriptions and synonyms to represent concepts as retrofitted target concept vectors in text-similarity-based social media MCN. Second, we generate concept vectors and concept mention vectors of the same size, which eliminates the need for dense layers to project concept mention vectors into the target concept embedding space. Our model outperforms existing methods with improvements of up to 3.75% on two standard datasets. Further, when trained only on mapping lexicon synonyms, our model outperforms existing methods with significant improvements of up to 14.61%. We attribute these significant improvements to the two novel changes introduced.

Want to Identify, Extract and Normalize Adverse Drug Reactions in Tweets? Use RoBERTa

SMM4H Workshop Shared Task @COLING 2020 December 1, 2020

This paper presents our approach for task 2 and task 3 of the Social Media Mining for Health (SMM4H) 2020 shared tasks. Task 2 requires differentiating adverse drug reaction (ADR) tweets from non-ADR tweets and is treated as binary classification. Task 3 involves extracting ADR mentions and then mapping them to MedDRA codes. Extracting ADR mentions is treated as sequence labeling, and normalizing ADR mentions is treated as multi-class classification. Our system is based on the pre-trained language model RoBERTa and achieves a) an F1-score of 58% in task 2, which is 12% more than the average score, and b) a relaxed F1-score of 70.1% in ADR extraction of task 3, which is 13.7% more than the average score, and a relaxed F1-score of 35% in ADR extraction + normalization of task 3, which is 5.8% more than the average score. Overall, our models achieve promising results in both tasks with significant improvements over the average scores.

Medical Concept Normalization in User-Generated Texts by Learning Target Concept Embeddings

LOUHI Workshop @EMNLP 2020 November 10, 2020

Medical concept normalization helps in discovering standard concepts in free-form text, i.e., it maps health-related mentions to standard concepts in a clinical knowledge base. It goes well beyond simple string matching and requires a deep semantic understanding of concept mentions. Recent research approaches concept normalization as either text classification or text similarity. The main drawback of a) the existing text classification approach is that it ignores valuable target concept information when learning the input concept mention representation, and of b) the existing text similarity approach is the need to separately generate target concept embeddings, which is time and resource consuming. Our proposed model overcomes these drawbacks by jointly learning the representations of the input concept mention and the target concepts. First, we learn the input concept mention representation using RoBERTa. Second, we compute the cosine similarity between the embedding of the input concept mention and the embeddings of all the target concepts. Here, the embeddings of the target concepts are randomly initialized and then updated during training. Finally, the target concept with the maximum cosine similarity is assigned to the input concept mention. Our model surpasses all the existing methods across three standard datasets, improving accuracy by up to 2.31%.
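The final assignment step described in this abstract (pick the target concept whose embedding has the highest cosine similarity to the mention embedding) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the vectors are hand-written toy values standing in for RoBERTa outputs and learned concept embeddings, and the names `cosine`, `assign_concept`, `concepts`, and `mention` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def assign_concept(mention_vec, concept_vecs):
    """Return the concept name whose vector is most similar to the mention."""
    return max(concept_vecs, key=lambda name: cosine(mention_vec, concept_vecs[name]))

# Hypothetical toy embeddings; in the paper these would be learned during training.
concepts = {
    "headache": [0.9, 0.1, 0.0],
    "insomnia": [0.0, 0.2, 0.9],
}
# Stand-in for the encoder output of a colloquial mention.
mention = [0.8, 0.3, 0.1]

predicted = assign_concept(mention, concepts)
print(predicted)
```

In the model described above, the concept vectors are trained jointly with the encoder, so only the lookup at inference time resembles this sketch.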

Target Concept Guided Medical Concept Normalization in Noisy User-Generated Texts

DeeLIO Workshop @EMNLP 2020 November 10, 2020

Medical concept normalization (MCN), i.e., mapping colloquial medical phrases to standard concepts, is an essential step in the analysis of medical social media text. The main drawback of the existing state-of-the-art approach (Kalyan and Sangeetha, 2020b) is that it learns target concept vector representations from scratch, which requires more training instances. Our model is based on RoBERTa and target concept embeddings. In our model, we integrate a) target concept information in the form of target concept vectors generated by encoding target concept descriptions using SRoBERTa, a state-of-the-art RoBERTa-based sentence embedding model, and b) domain lexicon knowledge by enriching target concept vectors with synonym relationship knowledge using the retrofitting algorithm. This is the first attempt in MCN to exploit both target concept information and domain lexicon knowledge in the form of retrofitted target concept vectors. Our model outperforms all the existing models with an accuracy improvement of up to 1.36% on three standard datasets. Further, our model, when trained only on mapping lexicon synonyms, achieves up to 4.87% improvement in accuracy.
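For readers unfamiliar with retrofitting, the sketch below shows the commonly used iterative update, in which each vector is pulled toward the average of its synonyms while staying anchored to its original value. It is an assumption that the paper uses this exact form: the weights (`alpha = 1`, neighbour weight `1 / number of neighbours`), the iteration count, and the names `retrofit`, `vectors`, and `synonyms` are illustrative choices, and the toy vectors are not SRoBERTa outputs.

```python
def retrofit(vectors, synonyms, iterations=10, alpha=1.0):
    """Pull each vector toward its synonyms' vectors.

    vectors:  dict name -> list of floats (original embeddings, left unmodified)
    synonyms: dict name -> list of neighbouring names
    """
    new = {name: list(vec) for name, vec in vectors.items()}
    for _ in range(iterations):
        for name, neighbours in synonyms.items():
            neighbours = [n for n in neighbours if n in new]
            if name not in new or not neighbours:
                continue  # nothing to retrofit against
            beta = 1.0 / len(neighbours)  # assumed uniform neighbour weight
            updated = []
            for d in range(len(new[name])):
                neighbour_sum = sum(beta * new[n][d] for n in neighbours)
                # Weighted average of the original vector and current neighbours.
                updated.append(
                    (alpha * vectors[name][d] + neighbour_sum)
                    / (alpha + beta * len(neighbours))
                )
            new[name] = updated
    return new

# Hypothetical toy concept vectors and synonym links.
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [5.0, 5.0]}
synonyms = {"a": ["b"], "b": ["a"]}

result = retrofit(vectors, synonyms)
print(result["a"], result["b"], result["c"])
```

Concepts without synonym links (here `c`) keep their original vectors, while linked concepts move closer together without collapsing onto each other.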

SECNLP: A survey of embeddings in clinical natural language processing

Journal of Biomedical Informatics November 8, 2019

Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge that can be transferred to downstream tasks. Even though embeddings have become the de facto standard for text representation in deep-learning-based NLP tasks in both general and clinical domains, there is no survey paper that presents a detailed review of embeddings in Clinical Natural Language Processing. In this survey paper, we discuss various medical corpora and their characteristics and medical codes, and present a brief overview and comparison of popular embedding models. We classify clinical embeddings and discuss each embedding type in detail. We discuss various evaluation methods, followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some future directions that will advance research in clinical embeddings.

Honors & Awards

Gold Medalist (MSc Computer Science, 2015-17 batch, NIT Trichy)

NIT Trichy

Aug 2017

Highest CGPA in MSc Computer Science (2015-2017)

More activity by Kalyan

This initiative is more beneficial to the founders of Vizuara to earn more money and less beneficial to the students. AI research requires a lot of…

Liked by Kalyan KS
This initiative is more beneficial to the founders of Vizuara to earn more money and less beneficial to the students. AI research requires a lot of…

Shared by Kalyan KS
If you are still using closed source API's you should read this article.

Liked by Kalyan KS
This is getting out of hand. School kids are now being made to publish in International Journals. Raj Abhijit Dandekar Rajat Dandekar , we understand…

Liked by Kalyan KS
2 dark secrets about Kissan Ketchup EXPOSED! Kissan FRESH tomato ketchup is not FRESH. They have trademarked the word "fresh". If you turn around…

Liked by Kalyan KS
𝐀𝐮𝐭𝐨𝐆𝐞𝐧 𝐒𝐭𝐮𝐝𝐢𝐨 - 𝐍𝐨 𝐂𝐨𝐝𝐞 𝐓𝐨𝐨𝐥 𝐟𝐨𝐫 𝐁𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐀𝐈 𝐀𝐠𝐞𝐧𝐭𝐬 𝐌𝐮𝐥𝐭𝐢-𝐀𝐠𝐞𝐧𝐭 𝐒𝐲𝐬𝐭𝐞𝐦𝐬 Multi-Agent…

Liked by Kalyan KS
𝐀𝐮𝐭𝐨𝐆𝐞𝐧 𝐒𝐭𝐮𝐝𝐢𝐨 - 𝐍𝐨 𝐂𝐨𝐝𝐞 𝐓𝐨𝐨𝐥 𝐟𝐨𝐫 𝐁𝐮𝐢𝐥𝐝𝐢𝐧𝐠 𝐀𝐈 𝐀𝐠𝐞𝐧𝐭𝐬 𝐌𝐮𝐥𝐭𝐢-𝐀𝐠𝐞𝐧𝐭 𝐒𝐲𝐬𝐭𝐞𝐦𝐬 Multi-Agent…

Shared by Kalyan KS
Explored RAG from Scratch 💡 🛠️ ▪ Retrieval-Augmented Generation (RAG) is a significant advancement in text generation and information retrieval…

Liked by Kalyan KS
🚀 Speed up your existing model serving >8x by utilizing dynamic batching and autoscaling. 💫 LitServe makes it trivial to serve machine learning…

Liked by Kalyan KS
While the world moves closer to gender equality in education, India faces a concerning setback. Ranked 129th out of 146 countries by the World…

Liked by Kalyan KS
India has tremendous research potential. We just need to orient people in the right direction. Instead of churning out bogus papers, if people work…

Liked by Kalyan KS
An engineer, an IAS officer, and now a world-class para-badminton player, Suhas Yathiraj’s journey defies expectations. Born with an ankle…

Liked by Kalyan KS
