
Datasaur
Founded Year
2019
Stage
Seed VC - III | Alive
Total Raised
$8.95M
Last Raised
$5M | 1 yr ago
Mosaic Score
The Mosaic Score is an algorithm that measures the overall financial health and market potential of private companies.
+24 points in the past 30 days
About Datasaur
Datasaur specializes in NLP data labeling and LLM development platforms. It offers a suite of tools for customizable data annotation, quality control management, and automation to make NLP and LLM projects more efficient. Its products are designed for industries such as legal, healthcare, financial, media, e-commerce, and government. The company was founded in 2019 and is based in Livermore, California.

ESPs containing Datasaur
The ESP matrix leverages data and analyst insight to identify and rank leading companies in a given technology landscape.
The data annotation market provides services for labeling large volumes of data in preparation for training AI and ML models. This market comprises both text and image & video annotation services. Most companies employ human annotators to classify and label datasets, with some offering AI-powered automation tools to speed up the process.
Datasaur is named as a Challenger among 15 other companies, including Intel, Baidu, and Scale.
Datasaur's Products & Differentiators
Text Labeling
Datasaur's primary product. Supports all types of text labeling products.
Research containing Datasaur
CB Insights Intelligence Analysts have mentioned Datasaur in 1 CB Insights research brief, most recently on Oct 3, 2023.
Expert Collections containing Datasaur
Expert Collections are analyst-curated lists that highlight the companies you need to know in the most important technology spaces.
Datasaur is included in 1 Expert Collection: Artificial Intelligence.
Artificial Intelligence
14,767 items
Companies developing artificial intelligence solutions, including cross-industry applications, industry-specific products, and AI infrastructure solutions.
Latest Datasaur News
Oct 30, 2023
Datasaur, a leading natural language processing (NLP) data-labeling platform, is debuting LLM Lab, an all-in-one interface for data scientists and engineers to build and train custom LLMs such as ChatGPT. According to the company, the product will provide a wide range of features for users to test different foundation models, connect to their own internal documents, optimize server costs, and more.

“We regularly connect with data science teams around the world looking to build their own LLMs,” said Ivan Lee, CEO and founder of Datasaur. “We’ve built a tool that holistically addresses the most common pain points, supports rapidly evolving best practices, and applies our signature design philosophy to simplify and streamline the process. Over the past year, we have constructed and delivered custom models for our own internal use and our clients, and from that experience, we were able to create a scalable, easy-to-use LLM product.”

Datasaur works with companies like Google and Blackbird to help label data 5.9x faster than manual labeling. The company has spent the last four years developing a comprehensive NLP solution, supporting methods like entity recognition, text classification, speaker diarization, and more, according to the vendor.

As generative AI has captured the industry’s attention, LLM Lab complements Datasaur’s existing NLP platform to provide a one-stop shop for all things related to text, documents, and audio. The company has seen a growing trend toward a hybrid approach that complements traditional NLP models with LLM capabilities. Datasaur’s platform will now support data scientists in both approaches, even allowing them to mix approaches and use LLMs to automate data labeling for traditional models. Heading into 2024, Datasaur will continue to invest in LLM development to fortify its position as the AI industry’s leading NLP platform, the company said.

LLM Lab will help save the most successful configurations and prompts and allow users to share their findings with colleagues. It will continue integrating with popular and up-and-coming foundation models such as Llama 2, Falcon, and Claude, along with technologies such as Pinecone LLM to slot seamlessly into model training workflows, according to the company. For more information about this news, visit https://datasaur.ai.
Datasaur Frequently Asked Questions (FAQ)
When was Datasaur founded?
Datasaur was founded in 2019.
Where is Datasaur's headquarters?
Datasaur's headquarters is located at 630 Selby Lane, Livermore.
What is Datasaur's latest funding round?
Datasaur's latest funding round is Seed VC - III.
How much did Datasaur raise?
Datasaur raised a total of $8.95M.
Who are the investors of Datasaur?
Investors of Datasaur include Initialized Capital, Gold House Ventures, TenOneTen Ventures, HNVR Technology Investment Management, J17 Capital and 7 more.
Who are Datasaur's competitors?
Competitors of Datasaur include Scale, Snorkel AI, Select Star, HumanSignal, Labelbox and 7 more.
What products does Datasaur offer?
Datasaur's products include Text Labeling and 1 more.
Compare Datasaur to Competitors

Hive is a leading provider of cloud-based AI solutions in the fields of content understanding, search, and generation. The company offers a suite of pre-trained AI models and turnkey software for tasks such as content moderation, brand protection, and data labeling. Hive's technology is widely used in platform integrity, sponsorship measurement, and context-based advertising among other applications. It was founded in 2013 and is based in San Francisco, California.

Labelbox develops a training data platform for machine learning teams to build real-world artificial intelligence (AI) solutions. The platform consists of label editor tools for batch and real-time labeling workflows, collaboration, quality review, analytics, and more. It serves the government, retail, insurance, manufacturing, and healthcare sectors. It was founded in 2018 and is located in San Francisco, California.

CloudFactory focuses on providing workforce solutions for machine learning and business process optimization. The company offers services such as data labeling, accelerated annotation, and human-in-the-loop automation, which support workflows and fill gaps in artificial intelligence (AI) and automation. CloudFactory primarily serves sectors such as the autonomous vehicles industry, finance, healthcare, insurance, and retail. It was founded in 2010 and is based in Reading, United Kingdom.

Defined.ai provides a range of pre-collected and structured training datasets, including text, voice, and image data, and hosts an online marketplace where these datasets can be bought, sold, or commissioned. Defined.ai caters to the AI development sector, providing data that aids in the creation of fair, accessible, and ethical AI solutions. The company was founded in 2015 and is based in Seattle, Washington.

Explosion is a software company focusing on artificial intelligence (AI) and natural language processing. The company offers developer tools to facilitate machine learning and data annotation, as well as advanced natural language processing capabilities. Its services primarily cater to the AI and machine learning industry. It was founded in 2016 and is based in Berlin, Germany.

Scale provides a data engine platform. The platform provides generative artificial intelligence (AI) strategy including fine-tuning, prompt engineering, security, model safety, model evaluation, and enterprise applications. It serves industries such as retail, electronic commerce, logistics, and more. Scale was formerly known as Scale Labs. It was founded in 2016 and is based in San Francisco, California.