Data Leaders Brief

No Training Data? No Problem!

Dataiku

MARCH 31, 2023

A significant quantity of training data has long been a key requirement of successful machine learning (ML) projects. In this blog post, we will see that new state-of-the-art approaches make it possible to mitigate or overcome this constraint in the context of computer vision.

Machine Learning

Machine Learning IT

ChatGPT, Author of The Quixote

O'Reilly on Data

MARCH 26, 2024

TL;DR: LLMs and other GenAI models can reproduce significant chunks of training data. Specific prompts seem to “unlock” training data. Generative AI Has a Plagiarism Problem: ChatGPT, for example, doesn’t memorize its training data, per se.

Modeling

Modeling Machine Learning Risk Advertising

Model Collapse: An Experiment

O'Reilly on Data

OCTOBER 24, 2023

Ever since the current craze for AI-generated everything took hold, I’ve wondered: what will happen when the world is so full of AI-generated stuff (text, software, pictures, music) that our training sets for AI are dominated by content created by AI? At some point in the near future, new models will be trained on code that they have written.

Modeling

Modeling Statistics Software Reporting

Risk Management for AI Chatbots

O'Reilly on Data

JUNE 27, 2023

But first, let’s dig deeper into the problem. Old Problems Are New Again: The text-box-and-submit-button combo exists on pretty much every website. Those 1990s web forms demonstrate the problem all too well. That code was too trusting, though.

Risk Management

Risk Management Risk Management Modeling

Copyright, AI, and Provenance

O'Reilly on Data

DECEMBER 12, 2023

Another group of cases involving text (typically novels and novelists) argue that using copyrighted texts as part of the training data for a Large Language Model (LLM) is itself copyright infringement, even if the model never reproduces those texts as part of its output. How do we make sense of this?

Modeling

Modeling Software Sales Statistics

10 things to watch out for with open source gen AI

CIO Business Intelligence

MAY 15, 2024

Even if you don’t have the training data or programming chops, you can take your favorite open source model, tweak it, and release it under a new name. “If you have a data center that happens to have capacity, why pay someone else?” It’s also the training data, model weights, and fine tuning.

Modeling

Modeling Risk Software Enterprise

Automated Mentoring with ChatGPT

O'Reilly on Data

OCTOBER 10, 2023

The Mentor role is particularly important to the work we do at O’Reilly in training people in new technical skills. Programming (like any other skill) isn’t just about learning the syntax and semantics of a programming language; it’s about learning to solve problems effectively. However, it isn’t a serious problem.

Testing

Testing Modeling IT Risk

When your AI chatbots mess up

CIO Business Intelligence

DECEMBER 8, 2023

But as the numbers of new gen AI-powered chatbots grow, so do the risks of their occasional glitches—nonsensical or inaccurate outputs or answers that are not easily screened out of the large language models (LLMs) that the tools are trained on. Hallucinations occur when the data being used to train LLMs is of poor quality or incomplete.

Risk

Risk Finance Data Quality Modeling

4 paths to sustainable AI

CIO Business Intelligence

JANUARY 31, 2024

Everything from geothermal data centers to more efficient graphic processing units (GPUs) can help. But AI users must also get over the urge to use the biggest, baddest AI models to solve every problem if they truly want to fight climate change. “All those 13,000 new models didn’t require any pre-training,” he says.

Cost-Benefit

Cost-Benefit Modeling Testing IoT

Porsche Carrera Cup Brasil gets real-time data boost

CIO Business Intelligence

MAY 21, 2024

In the annual Porsche Carrera Cup Brasil, data is essential to keep drivers safe and sustain optimal performance of race cars. Until recently, getting at and analyzing that essential data was a laborious affair that could take hours, and only once the race was over. The process took between 30 minutes and two hours.

Broadcasting

Broadcasting Recreation/Entertainment Manufacturing Data Lake

AI Has an Uber Problem

O'Reilly on Data

APRIL 4, 2024

“The economic problem of society…is a problem of the utilization of knowledge which is not given to anyone in its totality.” They weren’t buying users with subsidized prices; they were building data centers. In the case of artificial intelligence, training large models is indeed expensive, requiring large capital investments.

Marketing

Marketing Modeling Finance Experimentation

How BayCare Health System excels in raising data literacy

CIO Business Intelligence

MAY 15, 2024

Martha Heller: What does data literacy mean to BayCare Health System? When the environmental services team who cleans our operating rooms has the data to flip an OR quickly to get a new patient in, they work more efficiently. What data is most important for you right now? We’re using data to reduce that wait time.

Cost-Benefit

Cost-Benefit Data Governance Data Architecture Reporting

5 things on our data and AI radar for 2021

O'Reilly on Data

FEBRUARY 19, 2021

ML presents a problem for CI/CD for several reasons. The data that powers ML applications is as important as code, making version control difficult; outputs are probabilistic rather than deterministic, making testing difficult; training a model is processor intensive and time consuming, making rapid build/deploy cycles difficult.

Data Lake

Data Lake Data Warehouse Machine Learning Modeling

What you need to know about product management for AI

O'Reilly on Data

MARCH 31, 2020

AI products are automated systems that collect and learn from data to make user-facing decisions. All you need to know for now is that machine learning uses statistical techniques to give computer systems the ability to “learn” by being trained on existing data. Why AI software development is different.

Management

Management Machine Learning Experimentation Metrics

The extent Automic’s group CIO goes to reconcile data

CIO Business Intelligence

MAY 8, 2024

But that created problems because I was using all my personal time. Yet from the investor standpoint, there was no problem; he’s delivering. When that happens, the former registry needs to give us all the client data in the form of their choosing. Then we have to make sense of the data, massage it and import it in our system.

Risk

Risk Testing Strategy Technology

Power, Harms, and Data

O'Reilly on Data

JULY 28, 2020

It had similar problems with other images of Black and Hispanic people, frequently giving them White skin and facial features. Is this just a problem with training data, as Yann LeCun said on Twitter? Who stands to gain?

Testing

Testing Recreation/Entertainment Statistics Software

AI Adoption in the Enterprise 2021

O'Reilly on Data

APRIL 19, 2021

During the first weeks of February, we asked recipients of our Data & AI Newsletter to participate in a survey on AI adoption in the enterprise. The second-most significant barrier was the availability of quality data. Relatively few respondents are using version control for data and models. Respondents.

Enterprise

Enterprise Modeling Risk Manufacturing

IT leaders look beyond LLMs for gen AI needs

CIO Business Intelligence

MAY 21, 2024

But not all problems are best solved using LLMs, some IT leaders say, opening a next wave of multimodal models that go beyond language to deliver more purposeful results — for example, to handle dynamic tabular data stored in spreadsheets and vector databases, video, and audio data.

IT

IT Cost-Benefit Experimentation Forecasting

Exposure to new workplace technologies linked to lower quality of life

CIO Business Intelligence

MARCH 18, 2024

Commenting on the research, several IT experts pointed to a problem, not with the new technologies themselves, but with the training and enterprise culture around them. In the face of a problem, they are expected to work 24 hours a day to resolve it. That means it’s key to the business’s survival.

Technology

Technology Uncertainty Machine Learning Interactive

The top 15 big data and data analytics certifications

CIO Business Intelligence

JUNE 14, 2023

Data and big data analytics are the lifeblood of any successful business. Getting the technology right can be challenging but building the right team with the right skills to undertake data initiatives can be even harder — a challenge reflected in the rising demand for big data and analytics skills and certifications.

Big Data

Big Data Data Analytics Analytics Predictive Modeling

4 ways generative AI addresses manufacturing challenges

IBM Big Data Hub

APRIL 15, 2024

Technology and disruption are not new to manufacturers, but the primary problem is that what works well in theory often fails in practice. Or we create a data lake, which quickly degenerates to a data swamp. Contextual data understanding: Data systems often cause major problems in manufacturing firms.

Manufacturing

Manufacturing Contextual Data Knowledge Discovery Data Lake

The ‘Great Retraining’: IT upskills for the future

CIO Business Intelligence

SEPTEMBER 18, 2023

When the timing was right, Chavarin honed her skills to do training and coaching work and eventually got her first taste of technology as a member of Synchrony’s intelligent virtual assistant (IVA) team, writing human responses to the text-based questions posed to chatbots. Maggie Chavarin is no stranger to reinventing her career.

IT

IT Cost-Benefit Technology Business Intelligence

Assessing the business risk of AI bias

CIO Business Intelligence

JUNE 9, 2023

AI doesn’t get better than the data it’s trained on. And many CIOs and other senior managers are aware of the problem, according to an international survey commissioned by Swedish software supplier Progress. “It’s a challenge because the occasions with such cold are so rare that there is simply not much to train on,” she says.

Risk

Risk Machine Learning Modeling Technology

Where’s the ROI for AI? CIOs struggle to find it

CIO Business Intelligence

MAY 22, 2024

While Kane shows clients how to save time and money using AI tools like Microsoft Copilot, many SMB customers still don’t see the value of generative AI in tasks like writing a newsletter, when the AI doesn’t have access to their internal data. “What you really want to be doing is finding a problem to solve with it first.”

ROI

ROI IT Measurement Cost-Benefit

7 dark secrets of generative AI

CIO Business Intelligence

SEPTEMBER 12, 2023

They are data sieves: Humans have tried to create an elaborate hierarchy of knowledge where some details are known to insiders and some are shared with everyone. Already there are several high profile examples involving company data leaks and LLM guardrails being circumvented. Can you train your AI on your customers’ data?

Enterprise

Enterprise Consulting Technology IT

Multilingual Question Answering in Medicine based on XLM-RoBERTa

Ontotext

MARCH 15, 2024

This is part of Ontotext’s AI-in-Action initiative aimed at enabling data scientists and engineers to benefit from the AI capabilities of our products. Furthermore, as the clinical data is highly sensitive, there are no open-access models or datasets available to solve the task, especially in the multilingual setting.

Modeling

Modeling Metadata Testing Optimization

How Starlink transformed tech operations for Journey Beyond

CIO Business Intelligence

JANUARY 4, 2024

Journey Beyond’s biggest investment is in rail, owning and operating four luxury trains that traverse the continent. From a technical point of view, the rapid growth brought new opportunities, but also problems and confusion. This meant there was no real-time data on operations. “Perhaps quite the opposite.”

Data-driven

Data-driven Sales Strategy Management

Generative AI hallucinations: What can IT do?

CIO Business Intelligence

NOVEMBER 7, 2023

What IT can do about generative AI hallucinations: Fortunately, there are actions IT organizations can take to reduce the risk of generative AI hallucinations, either through decisions they make within their own environments or how internal users are trained to use existing tools. Here is a range of options IT can use to get started.

IT

IT Risk Modeling Strategy

A forensic look to modernize tech at South Africa’s SIU

CIO Business Intelligence

NOVEMBER 23, 2023

But as she points out, they aren’t entirely cloud-based, instead embracing a hybrid strategy because data sovereignty laws and governance structures dictate what kind of information they can and can’t store in the cloud. This made the data transfer process decidedly more complex. “If

Digital Transformation

Digital Transformation Strategy Management Technology

Business AI will change the way businesses are run

CIO Business Intelligence

OCTOBER 13, 2023

Simply by asking a question in plain language, our customers will get smart answers drawn from a pool of data from across the SAP portfolio and third-party sources. We have agreements with more than 25,000 customers to use their data in an anonymized way to train our own models.

Contextual Data

Contextual Data Optimization Sales Technology

6 business execs you’ll meet in hell — and how to deal with them

CIO Business Intelligence

JULY 10, 2023

Some steer clear of anything that smacks of interpersonal conflict, allowing small problems to fester into bigger ones. Roughly a decade ago, Mark Campbell was working as an advisor to the head data scientist of a large communications company. And some bosses or business colleagues are borderline sociopaths. They’re working 24-7.

Cost-Benefit

Cost-Benefit Sales Technology Management

Big Data Makes Smart Buildings the Norm in the 21st Century

Smart Data Collective

MARCH 21, 2022

It is one of the biggest trends driven by big data. And they can generate more data. Building management systems (BMS) do not, however, leverage the data from their smart buildings. They can use the data to make important decisions. Unfortunately, they do not even capture data from their modern buildings.

Big Data

Big Data Internet of Things IoT Dashboards

The data flywheel: A better way to think about your data strategy

CIO Business Intelligence

OCTOBER 25, 2022

Data & Analytics is delivering on its promise. Every day, it helps countless organizations do everything from measuring their ESG impact to creating new streams of revenue, and consequently, companies without strong data cultures or concrete plans to build one are feeling the pressure. We discourage that thinking.

Data Strategy

Data Strategy Strategy Data Lake Data-driven

AI-Driven Ransomware is a Terrifying Threat to Businesses

Smart Data Collective

SEPTEMBER 27, 2022

AI can train malware to evade antivirus protection software and bypass other elements of the computer security system. Unlike phishing, which doesn’t seem to have a simple exit option, ransomware has a payment wall in place that could alleviate the problem. They can also use AI technology to make their ransomware more vicious.

Machine Learning

Machine Learning Data-driven Software IT

Innovating Services for a Digital, Intelligent Future

CIO Business Intelligence

MAY 7, 2024

Designing and deploying a new ICT infrastructure, capable of handling the masses of data AI requires, is inherently complex. High-performance computing requirements lead to ever-higher power consumption, making greenhouse gas emissions a significant problem. Then there are also significant infrastructural challenges to deal with.

Digital Transformation

Digital Transformation IoT Consulting Big Data

8 big IT failures of 2023

CIO Business Intelligence

DECEMBER 26, 2023

Every problem is a teachable moment, of course, and we hope these disasters can serve as cautionary tales as you try to navigate your own potential IT troubles in 2024. The company used software from two different vendors for the purposes of “interoperability testing, validation and customer proofs of concept, training and customer support.”

IT

IT Software Reporting Testing

Transform the modern data center: From today to the future

CIO Business Intelligence

MAY 2, 2024

A seismic shift is underway in the evolution of the data center, driven by a variety of converging factors. EMA Research underscores the significant impact of manual errors, such as misconfigurations, which account for 27% of network problems. Learn more about high-performance and sustainable data center solutions from Cisco.

Data-driven

Data-driven Optimization Machine Learning Modeling

What is root cause analysis? A proactive approach to change management

CIO Business Intelligence

MAY 6, 2022

Root cause analysis (RCA) is a problem-solving process that focuses on identifying the root cause of issues or errors with the goal of preventing them from reoccurring in the future. When a problem is identified and removed, it is considered a “root cause” if it prevents the problem from reoccurring. Root cause analysis steps.

Management

Management Risk Management Visualization Risk

The change management Informatica needed to overhaul its business model

CIO Business Intelligence

MARCH 6, 2024

Many of our customers had already started to move their applications and it made sense they would want to transition to data management in the cloud as well. We built that end-to-end data model and process from scratch while we ran the old business. This isn’t just an IT or sales transformation; it’s a complete company transformation.

Modeling

Modeling Management IT Metrics

Augmented Analytics Must Provide Data Quality and Insight!

Smarten

APRIL 25, 2024

How Can I Ensure Data Quality and Gain Data Insight Using Augmented Analytics? There are many business issues surrounding the use of data to make decisions. One such issue is the inability of an organization to gather and analyze data.

Data Quality

Data Quality Analytics Machine Learning Visualization

Introducing the technology behind watsonx.ai, IBM’s AI and data platform for enterprise

IBM Big Data Hub

MAY 9, 2023

Over the past decade, deep learning arose from a seismic collision of data availability and sheer compute power, enabling a host of impressive AI capabilities. It sounds like a joke, but it’s not, as anyone who has tried to solve business problems with AI may know. But this is starting to change.

Enterprise

Enterprise Technology Modeling Cost-Benefit

Colorado AI legislation further complicates compliance equation

CIO Business Intelligence

MAY 10, 2024

Brian Levine, a manager partner at Ernst & Young who is also an attorney, reviewed the bill and doesn’t expect ignorance of a third party’s use of AI to be a major problem. These hidden AI activities, what Computerworld has dubbed sneaky AI, could potentially come to bear in compliance with legislation such as this. That’s legal.

Risk

Risk Insurance Machine Learning Sales

Governance and Fighting the Curse of Complexity

CIO Business Intelligence

MARCH 20, 2024

The boulder is made from complex infrastructure, network connections, data stores, and devices. Moreover, new sources of ever expanding data produced by generative AI and the unfettered growth of unstructured data introduce even more challenges. Training and awareness. Data at rest. Data in motion.

IoT

IoT Unstructured Data Insurance Data Processing

5 ways to deploy your own large language model

CIO Business Intelligence

NOVEMBER 16, 2023

Deploying public LLMs: Dig Security is an Israeli cloud data security company, and its engineers use ChatGPT to write code. But there’s a problem with it — you can never be sure if the information you upload won’t be used to train the next generation of the model. There’s no perfect solution. Dig Security isn’t alone.

Modeling

Modeling Enterprise Marketing Sales
