Mon.Sep 13, 2021

Beginner’s Guide To Create PySpark DataFrame

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Spark is a cluster computing platform that allows us to distribute data and perform calculations on multiple nodes of a cluster. The distribution of data makes large dataset operations easier to process. Here each node is referred to as a separate machine working on […]. The post Beginner’s Guide To Create PySpark DataFrame appeared first on Analytics Vidhya.

Data Loss: Hazards, Risks and Strategies for Prevention

Smart Data Collective

Many organizations and enterprises are constantly under threat of a cyber attack. Although data may be lost in a hacking incident, it can also be due to other intentional or accidental reasons. For example, you cannot rule out physical data theft, human error, computer viruses, faulty hardware, power failure, and natural disasters. One way to mitigate the loss of vital information is to have a sound backup system, which will improve the chances of recovering the data.

Learning Text Classification Using the fastText Library

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Let’s look at a practical application of the supervised NLP fastText model for detecting sarcasm in news headlines. About 80% of all information is unstructured, and text is one of the most common types of unstructured data. Due to its chaotic nature, analyzing, […].

Great Benefits of Leveraging Big Data in Investing

Smart Data Collective

What is value investing? It is when an investor buys stock at a price below the stock's actual value. However, value investing is challenging for most people. Successful investors find suitable assets like post-pandemic dividends and monitor their stocks. In addition, they make the right decisions to ensure their projects are successful. Understanding the characteristics that define undervalued stocks can help you maximize your profits.

Beyond the Basics of A/B Tests: Innovative Experimentation Tactics You Need to Know as a Data or Product Professional

Speaker: Timothy Chan, PhD., Head of Data Science

Are you ready to move beyond the basics and take a deep dive into the cutting-edge techniques that are reshaping the landscape of experimentation? From Sequential Testing to Multi-Armed Bandits, Switchback Experiments to Stratified Sampling, Timothy Chan, Data Science Lead, is here to unravel the mysteries of these powerful methodologies that are revolutionizing how we approach testing.

What Are n-grams and How to Implement Them in Python?

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Dear readers, in this blog, we will learn what n-grams are and explore them on text data in Python. It’s completely alright even if you have never heard of the term “n-grams” before. We will study and implement n-grams right from scratch! The objective […]. The post What Are n-grams and How to Implement Them in Python?
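
For readers new to the term, an n-gram is a run of n consecutive tokens taken from a text. The snippet below is a minimal sketch of the idea in plain Python, not the linked article's own code; the function name `ngrams` and the whitespace tokenization are assumptions made for illustration.

```python
def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Slide a window of width n across the token list.
    # If the list is shorter than n, the range is empty and so is the result.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Naive whitespace tokenization (an assumption; real text usually needs more care).
tokens = "we will study and implement n-grams".split()
bigrams = ngrams(tokens, 2)
print(bigrams[:3])
```

Setting n to 1, 2, or 3 yields unigrams, bigrams, or trigrams from the same token list.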

Using Building Analytics to Mitigate the Costs of Construction Mistakes

Smart Data Collective

Big data has been a gamechanger for countless companies in virtually every industry. The construction industry is no exception. Construction analytics is a new field that was worth just over $5 billion in 2018. It is growing at a rate of over 19% a year and should be worth $15.21 billion by 2026. Why is the construction analytics market growing at such a fast pace?

More Trending

The Challenges and (Awesome) Benefits of Switching From Spreadsheets to Dataiku

Dataiku

About two years ago, I decided it was time I learned how to use Dataiku. Having been part of Dataiku's marketing content team for a while, I was, of course, familiar with our software at a high level, but at that point my job didn't require me to use it for creating insights so much as simply consuming them. Still, I was curious and eager to learn and, when an opportunity presented itself in the form of a bunch of really messy video marketing data, I jumped at it, and my journey as a Dataiku use

Unique Data Visualization Techniques To Make Your Plots Stand Out

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Visualization plays an important role in gaining quality insights from the data. Our traditional data visualization techniques are already playing a significant role in obtaining insights. But it’s always useful to bring and adapt new visualization techniques to create more appealing plots.

Performance Dashboard: Facilitate The Performance of Your Business

FineReport

How do you stay competitive in a fiercely competitive digital environment? How do you effectively manage complex data indicators? How do you reasonably decide on the training and promotion of employees? A performance dashboard can help you deal with various business problems. What is a performance dashboard? A performance dashboard is a data visualization tool for management, which is often used to measure employees’ performance, while helping business personnel measure, monitor, and manage t

Windows OS Optimization Essentials Part 2: Microsoft Store

Nutanix

Operating systems can end up being a lot of work for administrators. Work to configure the image, work to install the applications, and work to provide the best user experience possible. As with any software, what is provided to you is what the developer intended, but not necessarily what you want or need for your end users.

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Speaker: Anne Steiner and David Laribee

As a concept, Developer Experience (DX) has gained significant attention in the tech industry. It emphasizes engineers’ efficiency and satisfaction during the product development process. As product managers, we need to understand how a good DX can contribute not only to the well-being of our development teams but also to the broader objectives of product success and customer satisfaction.

Operating Apache Kafka with Cruise Control

Cloudera

About Cruise Control. There are two big gaps in the Apache Kafka project when we think of operating a cluster. The first is monitoring the cluster efficiently and the second is managing failures and changes in the cluster. There are no solutions for these inside the Kafka project, but there are many good third-party tools for both problems. Cruise Control is one of the earliest open source tools to provide a solution for the failure management problem, and lately for the monitoring problem as well.

11 Examples of Good & Bad Data Storytelling

Juice Analytics

With the popularity of our list of 20 Best Data Storytelling Examples, we thought it worth finding some more data stories for inspiration. The good examples in this list demonstrate how to combine data visualization, interactivity, and classic storytelling. They show the importance of a clear message, supporting data and analysis, and a narrative flow to engage the reader.