vAIsual Inc, the company behind the largest visual dataset collection in the world, today launched the first of its non-biometric datasets, consisting of over 130,000 images of elephants, dogs and birds.
The Elephants In The Wild dataset, with over 24,000 images, will play a crucial role in training AI models to recognize, classify, and analyze elephant-related data. The resulting trained models can contribute to conservation efforts, habitat management, and wildlife protection, ultimately helping to safeguard elephant populations and their ecosystems.
Avian research for automatic species identification and environmental monitoring will be fueled by the Birds In The Wild dataset, containing over 5,000 images of birds. AI detection systems in drones and self-driving cars will be one use case for this data.
The third dataset released today is a large and Diverse Dogs Collection dataset. With over 110,000 images, the dog dataset serves as a valuable resource for training AI models to recognize dog breeds, detect dogs in images or videos, analyze dog behaviors, and contribute to various applications in veterinary science, pet adoption, and animal welfare. The trained models can assist in dog-related tasks, benefiting dog owners, breeders, veterinarians, and researchers alike.
The datasets are specially prepared to meet the needs of ML teams, with detailed and consistent metatags, high-resolution images and, most importantly, legal clearances.
Self-service access to the datasets is via the Dataset Shop, established in 2022 by clean data specialists vAIsual Inc, and specifically catering for research and engineering teams training AI for a range of applications.
According to vAIsual CEO, Michael Osterrieder, these datasets are the first of thousands going live on the site in the next few weeks. “After concluding licensing deals for over 400 million images, we now have the largest collection of licensable images for AI training available anywhere.”
“We are excited to launch three new datasets that focus on elephants, dogs and birds respectively. Using our proprietary dataset building technology, we can now assemble datasets consisting of tens of thousands of images of a particular theme or subject.
“Being able to collate and package these datasets saves engineers hundreds of hours preparing material for AI training,” says Osterrieder.
While saving time is a core benefit, Osterrieder also emphasizes the importance of having full legal clearance.
“We are starting to see dataset disclosure requirements emerging in some jurisdictions, which will mean any AI model trained on scraped data will risk being blocked,” says Osterrieder.
The availability of legally clean datasets that also remunerate the original content creators is an important step toward ensuring that companies building AI technology are doing it ethically and responsibly.
“Offering custom-prepared datasets containing premium visual content, with the consent of the original copyright owners (or their legal representatives), is essential for the AI industry to mature into a truly commercial and viable industry,” says Osterrieder.
In the coming weeks, additional datasets will be added to datasetshop.com. The datasets are specially prepared for engineers to add to their workflow for AI training and are commercially available in a variety of resolutions.
About Dataset Shop
First launched in 2022 by the “clean data guys”, vAIsual Inc’s Dataset Shop is a marketplace for visual media designed specifically for AI training purposes.
The online store initially sold the largest biometrically released human dataset, consisting of over 600,000 high-quality images custom shot for AI training.
The Dataset Shop is rapidly growing its collection of datasets through partnerships with stock agencies seeking to address the issue of datasets being widely scraped without the consent of copyright owners.
About vAIsual (pronounced v-eye-sual)
vAIsual was formed in 2020 by Michael Osterrieder and Nicolas Menijes, who were soon joined by industry veterans Mark Milstein and Istvan Novak. All founders are well connected to the IP licensing industry.
vAIsual covers the whole AI workflow, from dataset generation and delivery to optimizing training sessions. The company offers generated content to the commercial advertising industry as well as to the machine learning industry.
vAIsual exclusively relies on ethically sourced and legally clean datasets.