Manuel Romero

Mula, Región de Murcia, España
4K followers · 500+ connections

About

I would say I am a Senior Back-End developer working with the stack: Node.js / Express /…

Experience

  • Maisa

    Madrid, Comunidad de Madrid, España

  • -

    Madrid, Comunidad de Madrid, España

  • -

    Madrid, Comunidad de Madrid, España

  • -

    Mula, Región de Murcia, España

Education

  • Universidad de Murcia

    -

Publications

  • SantaCoder: don’t reach for the stars!

    Over the last two years, we have witnessed tremendous progress in the development of code
    generating AI assistants (Chen et al., 2021; Chowdhery et al., 2022; Nijkamp et al., 2022;
    Fried et al., 2022; Li et al., 2022; Athiwaratkun et al., 2022). Machine learning models are
    now capable of assisting professional developers through the synthesis of novel code snippets,
    not only from surrounding code fragments, but also from natural language instructions. The
    models powering these code completion systems are usually referred to as Large Language
    Models for Code—or code LLMs—and are created by training large transformer neural
    networks (Vaswani et al., 2017) on big corpora of source code. However, there is a lack of
    transparency in the research community on the development of these models due to their
    commercial value and the legal uncertainty around distributing training data and models.
    Some groups have released model weights (Fried et al., 2022; Nijkamp et al., 2022) or
    provided access to the model through a paid API service (Chen et al., 2021; Athiwaratkun
    et al., 2022), but these papers did not release the full training data or the preprocessing
    methods that were used.

    See publication (a minimal code-completion sketch follows this publications list)
  • BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling (2022)

    arXiv

    The pre-training of large language models usually requires massive amounts of resources, both in terms of computation and data. Frequently used web sources such as Common Crawl might contain enough noise to make this pre-training sub-optimal. In this work, we experiment with different sampling methods from the Spanish version of mC4, and present a novel data-centric technique which we name perplexity sampling that enables the pre-training of language models in roughly half the amount of steps and using one fifth of the data. The resulting models are comparable to the current state-of-the-art, and even achieve better results for certain tasks. Our work is proof of the versatility of Transformers, and paves the way for small teams to train their models on a limited budget.

    See publication (a minimal sketch of the perplexity-sampling idea follows this publications list)
  • The BigScience Corpus: A 1.6TB Composite Multilingual Dataset

    -

    As language models grow ever larger, the need for large-scale high-quality text datasets has never been more pressing, especially in multilingual settings. The BigScience workshop, a 1-year international and multidisciplinary initiative, was formed with the goal of researching and training large language models as a values-driven undertaking, putting issues of ethics, harm, and governance in the foreground. This paper documents the data creation and curation efforts undertaken by BigScience to assemble a 1.6TB dataset spanning 59 languages that was used to train the 176-billion-parameter BigScience Large Open-science Open-access Multilingual language model (BLOOM). We further release a large initial subset of the corpus and analyses thereof, and hope to empower further large-scale monolingual and multilingual modeling projects with both the data and the processing tools, as well as stimulate research into studying this large multilingual corpus.

    See publication
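
The SantaCoder abstract above describes code LLMs that synthesize code from surrounding fragments or natural-language instructions. The sketch below only illustrates that kind of completion with the Hugging Face transformers API; it assumes the released checkpoint is published on the Hub as bigcode/santacoder, and the prompt and generation settings are arbitrary choices, not the paper's evaluation setup.

    # Minimal code-completion sketch with a code LLM (assumed checkpoint id).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    checkpoint = "bigcode/santacoder"  # assumption: public Hub id of the release

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    # trust_remote_code may be needed for the model's custom attention implementation.
    model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

    # The prompt is an ordinary code fragment; the model continues it.
    prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Greedy decoding is used here only to keep the example deterministic; real completion assistants typically sample.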
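The BERTIN abstract introduces perplexity sampling: documents from the Spanish mC4 are subsampled according to the perplexity a cheap language model assigns them, favouring mid-perplexity text over boilerplate and noise. The sketch below is only an illustration of that idea under stated assumptions; the scorer, the Gaussian weighting, and the mu/sigma parameters are placeholders, not the paper's exact procedure.

    # Illustrative perplexity-sampling sketch (not the BERTIN implementation).
    import math
    import random

    def gaussian_keep_probability(log_ppl: float, mu: float, sigma: float) -> float:
        # Weight a document by how close its log-perplexity is to a target region:
        # very low perplexity tends to be repetitive boilerplate, very high
        # perplexity tends to be noise. mu and sigma are placeholders that would
        # be estimated from perplexity quantiles on a sample of the corpus.
        return math.exp(-((log_ppl - mu) ** 2) / (2.0 * sigma ** 2))

    def perplexity_sample(docs, perplexity, mu, sigma, seed=0):
        # `perplexity(doc)` stands in for any cheap language-model scorer.
        rng = random.Random(seed)
        for doc in docs:
            p = gaussian_keep_probability(math.log(perplexity(doc)), mu, sigma)
            if rng.random() < p:
                yield doc

    # Toy usage with a fake scorer; a real run would score mC4 documents.
    docs = ["buy now buy now buy now", "El gato duerme al sol.", "zxcv 9 9 9 qwerty"]
    fake_ppl = {docs[0]: 15.0, docs[1]: 120.0, docs[2]: 900.0}.get
    print(list(perplexity_sample(docs, fake_ppl, mu=math.log(120.0), sigma=0.7)))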

Courses

  • Machine Learning Crash Course

    -

Languages

  • English

    -

  • French

    -

  • Spanish (Castilian)

    -
