How GitHub harnesses AI to transform customer feedback into action
Learn how we’re experimenting with open source AI models to systematically incorporate customer feedback to supercharge our product roadmaps.
In today’s rapidly evolving tech landscape, the buzz around “generative AI” is impossible to ignore. It’s everywhere—on TV and social media, in productivity tools, at conferences, in our phones, you name it. The hype is real and what excites us at GitHub is the transformative potential of AI that we’re just beginning to unlock.
At GitHub, our primary goal is to continuously improve our platform to better serve our beloved developer community. We receive countless pieces of feedback through our support portal every day. The sheer volume of feedback we receive can be daunting. Despite our best efforts, manually sifting through all of that text data is an overwhelming challenge that results in a lot of untapped opportunities. Manual data classification and analysis by humans is error-prone and very time-consuming, as it often involves handling vast amounts of data, leading to fatigue and inconsistency. A Harvard Business Review study reveals that data scientists spend about 80% of their time on tasks like data collection and organization, including manual classification, which impedes efficiency and delays the discovery of valuable insights. This inefficiency is driving a shift toward automated systems that offer greater accuracy and speed, moving away from traditional analytics methods.
This challenge drove us to combine powerful data-mining techniques with machine learning algorithms to extract, interpret, and analyze customer feedback at scale. By transforming customer feedback into actionable insights through advanced AI analytics, we are able to advance our products and reinforce our commitment to user trust, ensuring that every voice is heard and valued.
Amplifying developer voices with AI
When I joined GitHub’s Customer Success Engineering team as a program manager, I started working closely with multiple product and engineering teams to provide actionable insights on product performance. One question remained constant during my tenure at GitHub: what are the main pain points customers are experiencing with our products? Even though it seems an easy question to answer, I found it very difficult to distill down the vast amount of data and ensure I was highlighting the right opportunities to improve customer experience. After reading hundreds of support tickets, I was driven by a focused mission: finding a way to honor the insights and feedback we receive from our vast user base and—in particular—let their voices help guide us as we prioritize the development of new features.
Although I have a passion for analytics, and I helped build multiple internal tools in the past, I knew this wouldn’t be an easy task due to the complexity of customer feedback text hidden in support tickets. I turned to my colleague, Steven Solomon, a staff software engineer with extensive experience, to explore potential solutions. Eventually, inspiration struck us: what if we could leverage the power of AI to systematically analyze and interpret our developer community’s feedback?
We then began to explore the market for AI-driven analytics solutions, but we quickly realized that we needed a tool that adhered to strict security and privacy regulations and incorporated tailored business metrics to be able to tell a compelling story to our product teams. We were inspired by the idea that “being able to visualize data and tell stories with it is key to turning it into information that can be used to drive better decision making” (Storytelling with Data). Motivated by this principle, we assembled a team of software engineers who shared the same mission and passion to create a unique internal AI analytics tool that presents the most relevant and actionable trends, complete with business context specifically tailored to GitHub’s product areas.
Experimenting with open source models
As the world’s largest open source code ecosystem, our journey with AI-driven analytics started by looking into open-source AI models hosted in our platform, including BERTopic. BERTopic is an open source topic modeling framework that leverages Bidirectional Encoder Representations from Transformers (BERT) embeddings to create dynamic and interpretable topics. BERT language model is an open source machine learning framework for natural language processing (NLP). BERT is designed to help computers understand the meaning of ambiguous language in text by using surrounding text to establish context. BERTopic combines BERT’s ability to generate high-quality document embeddings with a clustering algorithm, typically Hierarchical Density-Based Spatial Clustering of Applications with Noise, (HDBSCAN), to group similar documents together. The topics are then derived by extracting and aggregating the most representative words from each cluster.
One of the standout capabilities of BERT is its ability to understand and process multiple languages. This multilingual capability stems from its training on diverse datasets that include text in various languages. As a result, BERTopic can effectively analyze feedback from our global user base, identifying themes and issues regardless of the language in which the feedback is provided. This multilingual proficiency ensures that we capture a comprehensive picture of our user feedback, allowing us to more effectively address the needs of our international community.
One key aspect to highlight is that we don’t train any models with customer feedback from support tickets. Instead, we apply the pre-trained models to analyze the feedback text data and generate the insights.
The representative words generated by this model kick-started our project, but there was still a key piece missing—the outputs needed to be easily understandable by humans. A group of words is different from a full representative sentence of customer pain. This led to the next phase of development: summarizing those clusters into actionable insights.
Summarizing insights with GPT-4
To ensure we could display the customer feedback insights in a more comprehensible form, we decided to summarize them using GPT-4, a powerful and popular large language model (LLM).
GPT-4 is particularly effective at summarizing topic clusters because of its advanced natural language processing capabilities. The model can also be optimized to better understand the specific context of data. Optimizing GPT-4 without retraining the model involves adjusting the way we use it, such as adjusting prompts and setting parameters for specific tasks.
Optimizing GPT-4 without retraining the model involves:
- Optimizing prompts. Crafting and refining prompts to guide the model in generating relevant summaries.
- Setting parameters. Adjusting settings like temperature, max tokens, top-p, frequency, and presence penalties to control the model’s output.
- Iterative feedback. Continuously improving the model’s performance through human feedback and A/B testing.
This approach allows us to provide more precise and relevant summaries, ensuring that we surface valuable patterns to help uncover untapped opportunities and make more informed decisions.
Ship to learn
At GitHub, our “ship to learn” ethos is deeply rooted in our history. We truly value the journey as much as the destination. We believe that we can learn from every failure and that those failures lead us closer to success.
At first, we weren’t sure how to effectively communicate the data we generated. Generating useful AI insights might be a difficult task, but telling a good story with them can be an even more difficult task. Good visuals can help inform better business decisions, while bad visuals can confuse the audience and impede efficiency. Understanding the context, choosing the right visuals, and only displaying the important information are key aspects to successfully communicate data. To understand the context completely, we needed to fully understand our audience’s needs, so we revisited the fundamental question of why we needed these insights in the first place. The specifics of what data to show and how to present it would follow.
Embracing the “ship to learn” mindset, we decided to quickly generate an Azure Data Explorer (ADX) dashboard for the first trials. We developed multiple visuals and shared them across the company to collect feedback. This process helped us identify which visualizations our internal users found valuable and which ones were less effective. It became clear that we needed a tailored tool that incorporated business-specific context into our data. Only then could we effectively tell stories such as “Here are the top 10 customer pain points in support tickets for X product/feature.” This meant that we needed to create our own tool with advanced filtering capabilities to effectively navigate the intricacies of our feedback insights. Additionally, we needed the ability to connect the insights generated by our internal systems, enabling us to prioritize actions more effectively.
This marked the beginning of developing our internal web application to communicate insights through visuals. We now had the data, the context, and the effective visuals. The final piece was ensuring we focused our audience’s attention on the most important insights. Attributes, such as position on the page, color, and size, can help direct the audience’s attention to the most important information. Once again, we decided to ship our minimum viable product (MVP) to start collecting feedback and iterating on the visuals most relevant to our product teams. Following its official internal launch, our tool began revealing valuable insights in massive customer feedback text datasets, unlocking an array of new use cases that we were eager to explore.
Real-world impact
Integrating AI into our feedback analysis process has driven impactful outcomes:
- Transitioning from manual classification to automated trend identification. Using automated AI-driven trend identification has significantly enhanced our ability to scale our data analysis efforts. This shift saves time and increases the precision with which we understand and respond to developer feedback in support tickets.
- Identifying and addressing common pain points. Clustering feedback helps us identify recurring problems quicker and address them more efficiently. This can minimize disruption and enhance user productivity on the platform.
- Improving feature prioritization. By understanding what our developer community needs most, we can focus our efforts on the features that will provide the greatest benefit to them.
- Making data-driven decisions. By taking advantage of the clear, summarized insights our tool provides, our internal teams can make more informed decisions that are more aligned with the needs and desires of our developer community.
- Discovering new self-serve opportunities. The insights generated enable the identification of self-help opportunities that empower customers to resolve issues on their own more swiftly. This expedites problem resolution for users and enhances their capability to manage future issues independently, reducing dependency on direct support.
Moving forward
As we continue to refine our AI-driven analytics capabilities and incorporate more sophisticated techniques, we are excited about the potential to further enhance our understanding of customer feedback. Our commitment to leveraging AI not only demonstrates our dedication to innovation but also ensures that the voice of our developers remains at the heart of everything we do.
In conclusion, using AI to analyze customer feedback has transformed how we interact with and respond to our developer community. By turning vast amounts of feedback text data into actionable insights, we are better equipped to meet the needs of our users and drive the future of software development.
Next time you provide feedback on one of our platforms or through a support ticket, keep this in mind and add as many details as you can. Your detailed feedback helps us make more informed decisions and improve GitHub for everyone.
Tags:
Written by
Related posts
How we evaluate AI models and LLMs for GitHub Copilot
We share some of the GitHub Copilot team’s experience evaluating AI models, with a focus on our offline evaluations—the tests we run before making any change to our production environment.
Documenting and explaining legacy code with GitHub Copilot: Tips and examples
Learn how to document and explain legacy code with GitHub Copilot with real-world examples.
How to use GitHub Copilot: What it can do and real-world examples
How Copilot can generate unit tests, refactor code, create documentation, perform multi-file edits, and much more.