September, 2022

Enhancing Data Catalog with AI

David Menninger's Analyst Perspectives

Organizations are collecting data from multiple data sources and a variety of systems to enrich their analytics and business intelligence (BI). But collecting data is only half of the equation. As the data grows, it becomes challenging to find the right data at the right time. Many organizations can’t take full advantage of their data lakes because they don’t know what data actually exists.

Data Lake 266

How is Big Data Helping in the Development of Healthcare?

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction “Big data in healthcare” refers to the large volume of health data collected from many sources, including electronic health records (EHRs), medical imaging, genomic sequencing, wearables, payer records, medical devices, and pharmaceutical research. Its characteristics distinguish it from traditional electronic medical and human health data […].

Big Data 392

How to Correctly Select a Sample From a Huge Dataset in Machine Learning

KDnuggets

We explain how choosing a small, representative dataset from a large population can improve model training reliability.
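
As a rough illustration of the idea in this teaser, the sketch below draws a stratified sample, so that each class keeps roughly its share of the full population. It is not taken from the KDnuggets article: the function name `stratified_sample`, the 10% fraction, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(rows, key, fraction, seed=0):
    """Draw roughly `fraction` of the rows from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    sample = []
    for label in sorted(groups):
        members = groups[label]
        # Keep at least one row per group so rare classes are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Made-up population: 900 rows of class "a" and 100 rows of class "b".
population = [("a", i) for i in range(900)] + [("b", i) for i in range(100)]
subset = stratified_sample(population, key=lambda row: row[0], fraction=0.1)

# Count how many rows of each class ended up in the sample.
counts = {label: sum(1 for row in subset if row[0] == label) for label in ("a", "b")}
print(counts)
```

Sampling within each group, rather than from the whole population at once, is one simple way to keep a small subset representative when classes are imbalanced.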

Take Your SQL Skills To The Next Level With These Popular SQL Books

datapine

Business leaders, developers, data heads, and tech enthusiasts – it’s time to make some room on your business intelligence bookshelf because once again, datapine has new books for you to add. We have already given you our top data visualization books, top business intelligence books, and best data analytics books. Now it’s time to ponder over our hand-picked list of the 20 best SQL learning books available today.

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? 🌐 From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

American Airlines takes flight with analytics transformation

CIO Business Intelligence

In the wake of the COVID-19 pandemic, airlines have struggled with bad weather, fewer air traffic controllers, and a shortage of pilots, all leading to an unprecedented number of cancelations in 2022. According to Reuters, more than 100,000 flights in the US were canceled between January and July, up 11% from pre-pandemic levels. American Airlines, the world’s largest airline, is turning to data and analytics to minimize disruptions and streamline operations with the aim of giving travelers a s

Analytics 145

MLOps Helps Mitigate the Unforeseen in AI Projects

DataRobot Blog

The latest McKinsey Global Survey on AI proves that AI adoption continues to grow and that the benefits remain significant. But in the COVID-19 pandemic’s first year, many felt more strongly about the cost-savings front than the top line. At the same time, AI remains complex and out of reach for many. For example, a recent IDC study shows that it takes about 290 days on average to deploy a model into production from start to finish.

Metrics 145

More Trending

Blockchain Technology and its Types

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction Blockchain technology is a decentralized, distributed ledger that keeps a record of ownership of digital assets. Any data stored on the blockchain cannot be modified, making the technology a legitimate disruptor for payments, cybersecurity, and healthcare industries. Blockchain is a system of registering […].

How to Select Rows and Columns in Pandas Using [ ], .loc, iloc, .at and .iat

KDnuggets

Subset selection is one of the most frequently performed tasks while manipulating data. Pandas provides different ways to efficiently select subsets of data from your DataFrame.

How to Avoid Burning Out if You Are a Data Scientist

Dataiku

This is a guest article from Eric Kahuha. Kahuha is an ambitious data scientist and an experienced technical writer. His work has been published in many blogs. He writes highly technical yet easy-to-understand content for beginners and experts in the tech field.

What is employee experience? A vital factor for business success

CIO Business Intelligence

Employee experience has become a key factor in defining your company’s overall success. Positive or negative, employee experience can significantly impact your company’s productivity, efficiency, and its ability to recruit and retain talent. It can even impact your brand’s reputation long after an employee has exited the company. The COVID-19 pandemic has drastically changed the future of work by normalizing remote work, placing a new emphasis on workplace flexibility, and introducing hybrid w

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

A 12-Point Checklist for Public and Open Data Sites (with Examples)

Juice Analytics

Let the data run free! Government organizations, academic institutions, non-profits, and even passionate sports fans are gathering and sharing valuable data sets with the public. The topics are wide ranging, from climate change to health to inequality to happiness. It is a powerful way to support a cause and encourage data-driven analysis. These open data sets are set loose on a website in hopes that interested visitors will come flocking.

What Are the Most Serious Privacy Concerns Regarding Big Data?

Smart Data Collective

Given the growing importance of big data and the rising reliance of businesses on big data analytics to carry out their day-to-day operations, it is safe to say that big data has irrevocably altered the online world for anyone running a digital enterprise or an e-business. Big data’s invaluable insights are an essential factor in the success of enterprises.

Big Data 133

Data Warehousing with Snowflake and Other Alternatives

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction Over the past few years, Snowflake has grown from a virtual unknown to a vendor with thousands of customers. Businesses have adopted Snowflake as a migration target from on-premise enterprise data warehouses (such as Teradata) or as a more flexibly scalable and easier-to-manage alternative to […].

More Performance Evaluation Metrics for Classification Problems You Should Know

KDnuggets

When building and optimizing your classification model, measuring how accurately it predicts your expected outcome is crucial. However, this metric alone is never the entire story, as it can still offer misleading results. That's where these additional performance evaluations come into play to help tease out more meaning from your model.

Metrics 160
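
To make the point of the teaser above concrete, here is a minimal sketch (not from the KDnuggets article) of why accuracy alone can mislead on imbalanced data. The helper `binary_metrics` and the toy labels are assumptions for illustration; a library such as scikit-learn would normally be used for this.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up imbalanced data: 5 positives, 95 negatives.
# The hypothetical model finds only one of the five positives.
y_true = [1] * 5 + [0] * 95
y_pred = [1] + [0] * 4 + [0] * 95

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is 0.96 while recall is only 0.2, which is the kind of gap that additional metrics are meant to expose.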

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Speaker: Aarushi Kansal, AI Leader & Author and Tony Karrer, Founder & CTO at Aggregage

Software leaders who are building applications based on Large Language Models (LLMs) often find it a challenge to achieve reliability. It’s no surprise given the non-deterministic nature of LLMs. To effectively create reliable LLM-based (often with RAG) applications, extensive testing and evaluation processes are crucial. This often ends up involving meticulous adjustments to prompts.

Getting Data Into Shape for Reporting with Power BI

Paul Turley

I see a lot of Power BI projects that we are asked to fix or performance tune, and at least nine times out of ten, the answer is that the data needs to be shaped and transformed so it is optimized for reporting.

Reporting 115

Making AI accessible leads to greater innovation

CIO Business Intelligence

It’s difficult to visualise the true scale of AI, as it’s almost certainly more than you imagine – it’s going to contribute more to the global economy than the current GDP of India and China combined. PwC research suggests that AI could contribute as much as $15.7 trillion by 2030, and be singularly responsible for a 26 per cent boost in the GDP of local economies.

Testing 144

Rejoice! The Vantage Analytics and Data Platform Provide Incredible Power for All in a “Cloudy” Environment

Teradata

With the release of VantageCloud Lake and ClearScape Analytics, Teradata brings a cloud-native architecture to extend the technical innovations and differentiators that Vantage is well known for.

Data-Driven Companies Leverage OCR for Optimal Data Quality

Smart Data Collective

OCR is the latest technology that data-driven companies are leveraging to extract data more effectively. Using it can benefit your company in a number of ways. OCR and Other Data Extraction Tools Have Promising ROIs for Brands. Big data is changing the state of modern business. A growing number of companies have leveraged big data to cut costs, improve customer engagement, achieve better compliance rates, and earn solid brand reputations.

Entity Resolution Checklist: What to Consider When Evaluating Options

Are you trying to decide which entity resolution capabilities you need? It can be confusing to determine which features are most important for your project. And sometimes key features are overlooked. Get the Entity Resolution Evaluation Checklist to make sure you’ve thought of everything to make your project a success! The list was created by Senzing’s team of leading entity resolution experts, based on their real-world experience.

Get to Know All About Evaluation Metrics

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction Evaluation metrics are used to measure the quality of the model. Selecting an appropriate evaluation metric is important because it can impact your selection of a model or decide whether to put your model into production. The importance of cross-validation: Are evaluation metrics […].

Metrics 375

SQL vs NoSQL: 7 Key Takeaways

KDnuggets

People assume that NoSQL is a counterpart to SQL. Instead, it’s a different type of database designed for use-cases where SQL is not ideal. The differences between the two are many, although some are so crucial that they define both databases at their cores.

Driving Innovation Through Data and Analytics

Dataiku

When we think about innovation, most of us default to innovation in product and service offerings. While offering innovation is very much part of the innovation process, it’s not the only type of innovation, and some might even argue it’s the easiest for competitors to copy. And regardless of what we think innovation is, many of us may wonder how to innovate beyond just relying on the instincts of talented individuals.

What you need to know about IoT in enterprise and education

CIO Business Intelligence

In an era of data-driven insights and automation, few technologies have the power to supercharge and empower decision makers like that of the Internet of Things (IoT). As the adoption of IoT devices is expected to reach 24.1 billion by 2030, forward-thinking organisations and higher education institutions are realising that IoT technologies are providing access to insights and making things possible now that were too expensiv

IoT 137

Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity

Speaker: Nicholas Zeisler, CX Strategist & Fractional CXO

The first step in a successful Customer Experience endeavor (or for that matter, any business proposition) is to find out what’s wrong. If you can’t identify it, you can’t fix it! 💡 That’s where the Voice of the Customer (VoC) comes in. Today, far too many brands do VoC simply because that’s what they think they’re supposed to do; that’s what all their competitors do.

Data Governance and Strategy for the Global Enterprise

Cloudera

In a recent blog, Cloudera Chief Technology Officer Ram Venkatesh described the evolution of a data lakehouse, as well as the benefits of using an open data lakehouse, especially the open Cloudera Data Platform (CDP). If you missed it, you can read up about it here. Modern data lakehouses are typically deployed in the cloud. Cloud computing brings several distinct advantages that are core to the lakehouse value proposition.

Can Data Mining Aid with Off-Page SEO Strategies?

Smart Data Collective

Data mining technology has led to some important breakthroughs in modern marketing. Even major companies like HubSpot have talked extensively about the benefits of using data mining for marketing. One of the most important ways that companies can use data mining in their marketing strategies is with SEO. Data mining is especially useful in the context of offsite SEO.

Basic Concept Behind Apache Hive and Elasticsearch

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction I’ve always wondered how big companies like Google process their information or how companies like Netflix can perform searches in concise times. That’s why I want to tell you about my experience with two powerful tools they use: Apache Hive and Elasticsearch. […].

Welcome to TensorFlow!

KDnuggets

TensorFlow in Action teaches you to construct, train, and deploy deep learning models using TensorFlow 2. In this practical tutorial, you’ll build reusable skills hands-on as you create production-ready applications.

The Big Payoff of Application Analytics

Outdated or absent analytics won’t cut it in today’s data-driven applications – not for your end users, your development team, or your business. That’s what drove the five companies in this e-book to change their approach to analytics. Download this e-book to learn about the unique problems each company faced and how they achieved huge returns beyond expectation by embedding analytics into applications.

Data Science for Dummies

Dataiku

"How can you work with data scientists? You never liked math!"

IT budgets remain solid, despite tech industry headwinds

CIO Business Intelligence

A report issued Monday by private investment company Bain Capital indicated that, despite the numerous disruptions to the technology industry—including a global supply chain crisis and Russia’s invasion of Ukraine—most IT decision makers foresee either stable budgets or increases for the coming year. Over the past two years, the pandemic’s effects on that figure have been noticeable—at the onset, less than half of those polled said that they expected anything but a decrease in their budget for t

IT 137

Large Scale Industrialization Key to Open Source Innovation

Cloudera

We are now well into 2022 and the megatrends that drove the last decade in data — The Apache Software Foundation as a primary innovation vehicle for big data, the arrival of cloud computing, and the debut of cheap distributed storage — have now converged and offer clear patterns for competitive advantage for vendors and value for customers. Cloudera has been parlaying those patterns into clear wins for the community at large and, more importantly, streamlining the benefits of that innovation to

Big Data 109

Roles of Python Developer in Data Science Teams

Smart Data Collective

Data science is a very complex field that requires the insights of professionals from many different disciplines. One group of professionals that is especially important for data science projects is Python developers. What is the Python programming language? Why is it so important in the data science profession? What Is Python? Python is a powerful programming language that is widely used in many different industries today.

How to Build an Experimentation Culture for Data-Driven Product Development

Speaker: Margaret-Ann Seger, Head of Product, Statsig

Experimentation is often seen as an aspirational practice, especially at smaller, fast-moving companies that are strapped for time and resources. So, how can you get your team making decisions in a more data-driven way while continuing to remain lean and maintaining ship velocity? In this webinar, Margaret-Ann Seger, Head of Product at Statsig, will teach you how to build an experimentation culture from the ground up, graduating from just getting started with data-driven development to operating