🤖 Creating a fully autonomous AI developer to write and maintain code is still the holy grail of software development. There are many brilliant teams currently working on this, each with its own unique approach. The team at Cosine has made a massive step towards making this a reality. “Cosine achieves state-of-the-art results on the SWE-bench benchmark” They have achieved a score of 43.8% on SWE-bench, verified by OpenAI. What's even more incredible is that they have managed to do this with a lean team and a relatively limited budget. It's important to note that Cosine isn't looking to replace human developers entirely, but rather to augment them with human-like assistants they can collaborate with on any kind of coding task. Yang Li + Sam Stenner + Alistair Pullen Read more about their collaboration with OpenAI 👇 https://lnkd.in/gu7xXVPn
-
Coding has long been the jewel in the IT crown; how long it can hold that hegemony, only time will tell. Artificial intelligence has changed it all. The "battle" aspect highlights the rivalry among platforms such as GitHub Copilot, OpenAI's Codex, Amazon's CodeWhisperer, Tabnine, and others, all of which aim to become the go-to solution for developers and companies. These tools vary in capabilities, supported languages, and features, so they're often compared against one another on criteria like accuracy, speed, ease of integration, and overall impact on developer productivity.
AI Coding Tools Battle: Which Tool Will Lead the Future of Coding?
https://www.youtube.com/
-
If you read only one thing about AI/LLMs this week, make it this: Hear me out… by Gergely Orosz https://lnkd.in/eFTYDSMA
A quick update this week — I am still catching up from vacation — but I loved this take. Gergely writes the excellent Pragmatic Engineer Substack (subscribed!), and his thinking on AI coding is still evolving (see this from a few weeks ago: https://lnkd.in/esDxHMUZ). But this tweetstorm is right on: coding is possibly the best-suited domain for LLMs, because we have so many existing tools and workflows to check code for correctness, style, adherence to patterns, etc.
There's still a lot to do to get the right context into the system and to set up the right checkpoints in process and output, but agentic feedback loops combined with compilers and linters get you 90% of the way there. It's not a question of if, but exactly how.
That said, we are living in an age of lofty promises. Devin and OpenDevin and SWE-Agent (and Stride Conductor) are amazing — with a few tools and some situational awareness, LLMs are able to work with codebases much as a junior engineer would. They can't yet do everything a smart, experienced human dev can do — and that's OK! But they're really well suited for dirty jobs — stuff like tech debt remediation, fixing broken tests, clearing out trivial backlog, etc. Enjoy the weekend!
Gergely Orosz (@GergelyOrosz) on X
twitter.com
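The "agentic feedback loops combined with compilers and linters" point is concrete enough to sketch. Below is a minimal, hypothetical Python loop that illustrates the idea: a placeholder LLM call (not any real API) produces code, a linter checks it, and the diagnostics are fed back into the next attempt.

```python
# Minimal sketch of an agentic feedback loop: generate code, check it with a
# linter, feed the diagnostics back, repeat. `generate_code` is a hypothetical
# stand-in for whatever LLM call you use; it is not part of any real library.
import pathlib
import subprocess
import tempfile


def lint(source: str) -> str:
    """Run pyflakes on the candidate source and return its diagnostics ('' if clean)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    result = subprocess.run(["pyflakes", path], capture_output=True, text=True)
    pathlib.Path(path).unlink(missing_ok=True)
    return (result.stdout + result.stderr).strip()


def agentic_loop(task: str, generate_code, max_rounds: int = 3) -> str:
    """Ask the model for code, then re-prompt with linter feedback until it is clean."""
    source, feedback = "", ""
    for _ in range(max_rounds):
        source = generate_code(task, feedback)  # hypothetical LLM call
        feedback = lint(source)
        if not feedback:  # no diagnostics left: accept this candidate
            break
    return source
```

The same shape generalizes: swap the linter for a compiler, a type checker, or a test suite, and the loop gets stricter without any change to the model.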
-
CodeMate #VSCode Extension 2.9.0 released yesterday. Check out what's new 🤯
- Code Actions and Code Lens added for modifying the codebase, creating docstrings or inline comments, or asking any question within the editor through in-line commands 💯
- Added shortcut commands to trigger CodeMate inside the editor:
- // Codemate: <query>: for asking any question related to the code
- // Generate: <query>: for generating code from natural language within the editor, keeping the existing code in context
Check out CodeMate AI now at https://codemate.ai
Ayush Singhal Kshitij S. Tyagi
#codemateai #softwaredevelopment #AI #Programming
-
Discover the power of #AI in coding! Tools like Tabnine (AI-driven code suggestions), Codex (AI-generated code snippets), and DeepCode (AI-based code review) are revolutionizing the way we code, making development faster, more efficient, and more accurate. #AICoding #FutureOfWork
-
I am optimistic about LLMs in software engineering and wanted to put this to the test by competing in the first AI competitive coding competition, the NeurIPS HackerCup AI Competition. My team, Matus Lecky and Isaac Ray, competed in the open track of the competition, where we faced the challenge of self-hosting inference and training on a 40GB A100, limiting us to small LLMs. We made it to Round 2, where we came 4th out of 872 participants; however, in Round 3 the questions got significantly harder and our agent was unable to solve a problem (in our defense, neither did almost any other team, including the closed-track teams with access to the superior o1 model). Despite the ending, I would still like to briefly describe our strategy, because we put considerable effort into it:
🔹 Scaffolding & pipeline: Careful prompt engineering was essential to getting the small LLM to produce high-quality code and reasoning that could be successfully parsed and executed.
🔹 Observations & CoT: We generated a pool of observations about the problem and step-by-step reasoning, which we randomly selected from to concatenate to the problem statement.
🔹 Codestral-22B: We tested many models and found this was the best base model that fit comfortably in 40GB.
🔹 Maj@128: We generated 128 code samples, tested them against the sample cases, and applied majority voting if multiple passed (see the sketch below).
🔹 Inference speed: Using vLLM for parallel inference and carefully tuned parameters, we reached an average output of 2,000 tokens/s, allowing more tokens per question without exceeding the strict time limit.
🔹 Code improvement: Repeatedly improve the best-scoring samples until they passed.
Using these strategies, we enabled Codestral-22B to solve the easier questions of the competition, which it previously couldn't handle in a zero-shot setup. Despite this progress, it is clear that open models are not yet competent competitive coders, but I'm optimistic that with current advancements in LLM reasoning, next year's competition will show major improvements. We'll also be open-sourcing our codebase at https://lnkd.in/eC9v9U4N for those interested in building on it or exploring further enhancements. We're excited to see where this work can lead!
GitHub - Joeclinton1/MapCoder-Hackercup: A modified version of the MapCoder project made for the Neurips 2024 Hackercup Ai track
github.com
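For anyone curious what the Maj@128 step looks like, here is a rough, simplified Python sketch under my own assumptions (the actual implementation is in the linked repo): `sample_program` is a hypothetical stand-in for a sampling call to the serving stack, not part of vLLM or any other real API.

```python
# Simplified sketch of the Maj@128 idea: sample k candidate programs, keep the
# ones that reproduce the sample output, run those on the contest input, and
# majority-vote over their outputs. `sample_program` is a hypothetical LLM call.
import subprocess
from collections import Counter


def run_candidate(source: str, stdin_text: str, timeout: int = 10) -> str:
    """Execute a candidate Python solution and capture its stdout."""
    proc = subprocess.run(
        ["python", "-c", source],
        input=stdin_text, capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout.strip()


def maj_at_k(problem: str, sample_in: str, sample_out: str,
             contest_in: str, sample_program, k: int = 128) -> str | None:
    outputs = []
    for _ in range(k):
        source = sample_program(problem)  # hypothetical sampling call
        try:
            if run_candidate(source, sample_in) == sample_out.strip():
                outputs.append(run_candidate(source, contest_in))
        except (subprocess.TimeoutExpired, subprocess.SubprocessError):
            continue
    if not outputs:
        return None  # nothing passed the sample case
    # Majority vote over the outputs of candidates that passed the sample case.
    return Counter(outputs).most_common(1)[0][0]
```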
-
𝐒𝐦𝐨𝐥 𝐀𝐠𝐞𝐧𝐭𝐬: 𝐓𝐡𝐞 𝐒𝐦𝐚𝐥𝐥 𝐀𝐈 𝐇𝐞𝐫𝐨𝐞𝐬 𝐌𝐚𝐤𝐢𝐧𝐠 𝐚 𝐁𝐢𝐠 𝐃𝐢𝐟𝐟𝐞𝐫𝐞𝐧𝐜𝐞!
Hugging Face introduces the most straightforward way yet to build AI agents! It has created the simplest framework so far, one that cuts through the complexity.
What makes this special:
• Works with ANY language model (OpenAI, Anthropic, or Hugging Face Hub models)
• First-class support for Code Agents: not just agents that write code, but agents that execute their actions through code
• Seamless Hugging Face Hub integration for sharing and loading tools
• Zero unnecessary abstractions, just clean, efficient code
Smol agents are powerful yet beautifully minimal.
#AI #Programming #TechInnovation #AIAgents #OpenSource #HuggingFace #DeveloperTools https://lnkd.in/gicmFGdc
smolagents
huggingface.co
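To make the "agents that execute their actions through code" point concrete, here is a minimal Python sketch in the style of the smolagents docs; the class names (e.g. HfApiModel) and defaults reflect the library at the time of writing and may differ between versions, so treat it as an illustration rather than the canonical API.

```python
# Minimal smolagents sketch: a CodeAgent writes and runs Python to answer the
# query. Class names and defaults are assumptions based on the docs and may
# have changed in newer releases.
from smolagents import CodeAgent, HfApiModel

model = HfApiModel()                      # defaults to a hosted Hugging Face model
agent = CodeAgent(tools=[], model=model,  # no custom tools for this example
                  add_base_tools=True)    # enable the built-in tool set

# The agent plans in code: each step is a Python snippet it executes itself.
print(agent.run("How many seconds are there in a leap year?"))
```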
-
In just ONE year, AI has gone from struggling with even the simplest bug fixes to successfully handling nearly three-quarters of a set of real-world coding tasks drawn from popular open-source software.
Why AI Progress Is Increasingly Invisible
time.com
-
AI-powered code reviews with CodeRabbit 🤖
As we head into the greatest era of AI, we now have this tool to review the code in a PR. 💤
I don't know how effective it is, since everyone's code follows a different sort of architecture. 🤷
Will you use this tool to review your PRs? 🤔
Get yours: https://lnkd.in/dGVYB3Ee
#codereviews #coderabbit #githubpr #codereviewai #github #ai
-
I've yet to understand how people even get running code with AI. Beyond basic repetitive lines, Copilot still makes up methods and syntax for me, so I end up debugging longer than it would take to just use the IDE's auto-completion. AI coding is amazing when you know what you want, but when it gets ahead of you it's not that different from copying mystery code from Stack Overflow. #programming #webDevelopment #ai https://lnkd.in/dbP9UfJ3
-
Tired of Prompt Engineering Hassles?
Are you tired of the frustration that comes with crafting complex prompts for large language models (LLMs)? You're not alone. Many developers struggle with the intricacies of prompt engineering, often feeling overwhelmed and stuck. DSPy is designed to simplify your life and empower you to focus on what you love most: building innovative solutions. Imagine a tool that allows you to create and optimize LLM applications effortlessly; this is the promise of DSPy.
** How DSPy Works
DSPy transforms your interaction with LLMs by shifting from manual prompting to a programmatic approach. It automates prompt generation and model optimization, significantly reducing the risk of errors. With DSPy, you can develop adaptive pipelines that learn and improve over time, making your workflow more efficient and effective.
** What You Can Achieve with DSPy
With DSPy, you can streamline your LLM development process and eliminate the hassle of complex prompt crafting. Build efficient applications that dynamically adjust to changing data and requirements, all while joining a vibrant community that continuously evolves and enhances the DSPy framework. Dive into the world of DSPy, where you focus on programming, not prompting. Discover how this innovative tool can transform your AI projects. Find the link to DSPy in the comments!
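For a taste of what "programming, not prompting" looks like, here is a minimal Python sketch in DSPy's declarative style; the model identifier is a placeholder and the exact configuration calls may vary between DSPy versions, so check the official docs before copying it.

```python
# Minimal DSPy sketch: declare a signature ("question -> answer") instead of
# hand-writing a prompt. The model name is a placeholder; configuration calls
# may differ between DSPy versions.
import dspy

dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))  # any supported provider works

qa = dspy.ChainOfThought("question -> answer")    # DSPy builds the prompt for you

result = qa(question="What does DSPy optimize instead of hand-written prompts?")
print(result.answer)
```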