Freeplay

Software Development

Boulder, Colorado

A better way to build with LLMs. Prompt engineering, testing & evaluation tools for your whole team.

About us

A better way to build with LLMs. Bridge the gap between domain experts & developers. Prompt engineering, testing & evaluation tools for your whole team. Now in private beta.

Website
http://freeplay.ai
Industry
Software Development
Company size
2-10 employees
Headquarters
Boulder, Colorado
Type
Privately Held
Founded
2022
Specialties
Artificial Intelligence and Developer Tools

Updates

  • Freeplay

    Colorado's AI scene is on 🔥🔥🔥 See it for yourself on Wednesday, Sept 18th at the AI Builders meetup in Denver! We've partnered with Denver Startup Week and DenAI Summit to bring our Boulder meetup to Denver and showcase the best AI products being built in Colorado. Come see hand-picked demos and network with 300+ other builders! Shout out to our co-hosts Ombud and Matchstick Ventures! There will also be light bites and drinks thanks to our good friends and partners at SVB and Technical Integrity! Register to join https://lnkd.in/g8HrKA2p

    RSVP to AI Builders Meetup - Denver Startup Week Edition | Partiful
    partiful.com

  • Freeplay

    🔥 5 killer demos from Google, Nvidia, and local startups 🎉 1 jam-packed space

    The AI Builders meetup in Boulder never disappoints! Last night’s demos were awesome as always. 🤯 Here’s the recap for those who missed it:
    * Jane Fine showed us what her team is up to at Google Code Labs, solving data science problems with agents
    * Nathan S. Robinson demo’d his company Plotzy and their use of agentic AI and RAG on massive commercial real estate data sets
    * Anneliese Niebauer showed how ShowStop automates video production end to end using existing company content
    * Matt Bruehl rolled in from NVIDIA to demo their NIM serverless endpoints on Hugging Face 🤗
    * Nadia Eldeib closed out the night by showing us how CodeYam can simulate software for pull requests using AI

    We’re consistently blown away by the quality of what's being built in Colorado. And we’re deeply grateful to our partners who make it possible each month! Huge thanks to Kiln, SVB, and Technical Integrity.

    Ready for the next meetup? September will be in Denver for Denver Startup Week! RSVP link in the comments. 🙌

  • Freeplay

    We made a BIG update to our Eval Alignment workflow in Freeplay. It's now *radically* easier to create, tune & deploy model-graded evals for AI products (aka "LLM-as-a-judge"). 🦾

    We're on a mission to help people build great AI products. At the core, that means having a system that catches issues and helps you fix them over time. Now anyone on a product development team (engineers of course, but also PMs, designers, or SMEs who aren't writing code) can:
    * Spot an issue in a production application
    * Create a new eval (from scratch, or using a template for common evals)
    * Automatically catch that issue in the future, kind of like a feature test
    And it's seamless to deploy a new eval for production monitoring as well as offline tests & experiments.

    Why does this help? To make a high-quality model-graded eval, you need the following:
    * A prompt & model that generate a score
    * A benchmark dataset large enough to represent real-world customer scenarios, with a balanced distribution of scores for the eval
    * Ground truth labels on that benchmark dataset from reliable human experts, so you know how a human would expect each example to be scored

    Given that it can be a struggle to even start writing a good eval prompt, the work required to do the rest has been so onerous that many teams postpone it, even when they know they want and need good evals. We're changing that!

    Our new Eval Alignment workflow lets people:
    1. Create an eval prompt
    2. Automatically generate a benchmark dataset from real data
    3. Test a new eval prompt & create ground truth labels at the same time
    4. Iterate as needed (with built-in versioning)
    5. Grow the dataset & strengthen the eval automatically as they use Freeplay

    The video shows how it works. We'd love to talk more if you're thinking about this problem -- more coming on this front soon. 🛠️
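    For readers newer to the pattern, here is a minimal sketch of what a model-graded eval plus an alignment check can look like in plain Python. This is illustrative only and is not Freeplay's API; the judge prompt, benchmark examples, and model name are all assumptions made for the example.

    # Minimal sketch of an "LLM-as-a-judge" eval with an alignment check.
    # Illustrative only -- not Freeplay's API. The judge prompt, dataset,
    # and model name below are assumptions.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    JUDGE_PROMPT = """You are grading an AI assistant's answer.
    Question: {question}
    Answer: {answer}
    Reply with exactly one word: PASS or FAIL."""

    # Hypothetical benchmark dataset with ground-truth labels from human experts.
    benchmark = [
        {"question": "What is 2 + 2?", "answer": "4", "label": "PASS"},
        {"question": "What is the capital of France?", "answer": "Berlin", "label": "FAIL"},
    ]

    def judge(example: dict) -> str:
        """Score one example with the judge model."""
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # assumption; any chat model works here
            messages=[{"role": "user", "content": JUDGE_PROMPT.format(**example)}],
        )
        return resp.choices[0].message.content.strip()

    # "Alignment" here is judge/human agreement on the labeled benchmark:
    # iterate on the judge prompt until this number is high enough to trust.
    verdicts = [judge(ex) for ex in benchmark]
    agreement = sum(v == ex["label"] for v, ex in zip(verdicts, benchmark)) / len(benchmark)
    print(f"Judge/human agreement: {agreement:.0%}")

    Once the judge reliably agrees with your human labels, the same judge prompt can be reused for production monitoring and offline tests, which is the loop the workflow above automates.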

  • Freeplay

    Great deep dive on the current state of AI product evals from Daniel Porras Reyes at Flybridge. Give it a read. ✨ It covers a lot of the topics that matter most to building generative AI products:
    💊 Why building a good eval system for an AI product is essential
    📚 The different types of evals and how they combine in practice
    🔁 How evals help teams build a better feedback loop to continuously improve

    Our CEO Ian Cairns was quoted in the article, talking about the importance of getting everyone on a software team working together to improve evals and product quality (not just engineers). We've seen this approach make a big impact for our customers:

    "We’ve found that in practice, product development teams build up an eval suite over time, and it’s often a multi-disciplinary process. There’s a feedback loop on evals themselves, where teams start with vibe checks, and then realize what the important underlying criteria are that they’re assessing with those ‘vibes.’ Giving the whole team including PMs, QA and other domain experts an easy way to look at data, label it with existing criteria, and suggest new evals gets that flywheel going.”

    Check it out below. 👇

    Daniel Porras Reyes

    @Flybridge | Venture Capital | Data and enablement platforms for AI builders | Creator AI Index

    In Gen AI, experimentation is key to success. To build continuously improving systems, evaluations must be a central pillar—after all, you can't improve what you don't measure. Evaluation systems should be tailored to each company and adaptable based on real user interactions. In the latest Flybridge article, we deep dive into the characteristics of a good evaluation system, types of evaluations, and human vs. AI-based model evaluations. We also discuss current areas of research in the space and the strategic role evaluations have for generative AI systems beyond reliability. Massive thanks to Julia Neagu from Quotient AI, Ian Cairns from Freeplay, Yi Zhang from Relari, and Zhao Chen for the insights, feedback, and quotes for the article. As always, thanks to my colleagues Chip Hazard and Jeffrey Bussgang for their contributions. https://lnkd.in/emndRkXt

    Scaling the Vibes: Gen AI systems evaluations — Flybridge
    flybridge.com

  • Freeplay reposted this

    Curious what the constant drop in latency will mean for building products with LLMs? Some thoughts from Jeremy Silva on our AI Engineering team. 🙌

  • Freeplay

    We’re beyond excited to announce that Nina Stepanov has joined the team at Freeplay as our Product & Growth Marketing Lead!

    Freeplay is making a meaningful impact for our early customers, and Nina is joining as our first full-time GTM hire to help us scale up access and introduce the product to others.

    Nina was previously on the growth team at Alloy, where she worked on building top-of-funnel demand. Prior to Alloy, Nina was a marketer at industry leaders like HubSpot and Intuit, as well as early-stage startups like Embedly (YC W10, acquired by Medium) and ViewPoint Cloud (acquired by OpenGov). She also spent time as a venture investor, backing and advising pre-seed and seed-stage B2B SaaS startups on their day 1 marketing and sales strategies.

    Nina’s been in the early-stage tech ecosystem for a long time. We’re excited to have her on the team to help shepherd Freeplay into the hearts, minds, and code bases of product development teams building with generative AI.

    Curious to partner with us? We’d love to talk. Shoot Nina an email at [email protected] and follow us to stay up to speed with the latest on building products with generative AI.

    • Nina Stepanov joins Freeplay as first GTM hire
  • Freeplay reposted this

    James Le

    I help AI infrastructure startups cross the chasm

    I am stoked to represent Twelve Labs with Sean Barclay and Haram Jo at the AI Tinkerers Denver Hackathon next weekend: https://lnkd.in/gGDyQt36 The organizing team has been working hard to make this happen: Junaid Dawud and Jordan Watkins 💪 The judges are stellar: Alex Volkov, Joshua Rubin, Austin V., and Daniel Ritchie 👏 It's going to be a good weekend 😊

    Focused Labs

    We’re teaming up with our friends from Twelve Labs, Groq, and AI Tinkerers to host Denver’s first multimodal Hackathon. Join us for all things AI-driven development and get to know the incredible Denver Development community! Not to mention, you can win some pretty cool prizes while you’re at it! #Hackathon #DenverDevelopers #DenverHackathon #Aidrivendevelopment 
