Be a Data Avenger not a Carpenter

Written by Fahd Maqba, Business Intelligence Lead, Emirates

Written by Fahd Maqba on Apr 30, 2019 9:44:02 AM

CDAO Africa

Ok guys, here's my first blog post, and I hope it's a good start. I'm happy to get your feedback, whether positive or constructive; my objective here is to share my ideas and communicate with you, wherever you are located. The topic is analytics and data storytelling.

For any data analyst, Business Intelligence analyst, or just a mediocre analyst, Microsoft Excel is a daily necessity. I'll bet you on this: if you ask anyone in your team, or yourself, to provide a report or tell a story with numbers, the work starts in Excel. That doesn't mean you are doing anything wrong; it just shows how scalable and user-friendly a tool Microsoft has given us, doesn't it? But as with any other item in an analyst's toolset, one tool doesn't solve all your problems. People often assume that one tool can solve all their data issues. It's like expecting a 1.6L car engine to perform like a Maserati. Highly unrealistic, isn't it? So here is a list of common points that I thought would be useful. I will do my best to keep it simple and understandable, not just for data science fanatics but for anyone who wishes to become one, or even for a regular analyst. And I would certainly recommend sharing it widely in order to drive a data-driven landscape in your job.

I think many companies don't have the data that we keep talking about. The buzzword, Big Data, remember? So far, many companies have not understood what data really is, so ask yourself: what counts as data from your perspective? When we talk about surveys (be it random sampling, stratified sampling, etc.), it is important to note that surveys are purely primary research and opinion-based pieces. I wouldn't count that as real data when it comes to analyzing and coming up with data-driven insights and decisions. In every company now, with the adoption of cloud, the data layer is being separated from the application; it's a big movement out there. Everybody you talk to, from a large aircraft manufacturer to a retailer, is creating a data lake or a data pond or some form of ocean, some water body, but there's data in it. The intent of these data lakes was "let's bring all the pieces and all the variables together, and then let's figure out the insights to drive our business".

1)     However, when it comes to that insights layer, very few companies have really figured out "how do I mine this data?" It's not about putting data into enterprise ERP solutions and then accessing it. It's about actually looking at the business problem and framing the right business question. And yes, it has to be a smart one as well! Only then can a solution be framed correctly. I'm putting a lot of emphasis on this because 95% of the projects that I come across in case studies and books failed because they did not frame the problem correctly. It's like having a plan that is perfect in theory but fails miserably in practice. Or there have been cases of going for a big bang rather than picking the low-hanging fruit to harvest the quick wins.

And then, it's not about "Do I have all the data in one place or not?" It is about "Do I have the right data?" Personally, to carry out my analysis, I always remind myself: I don't need big data, I'm okay with small data, but do I have the right data? And if not, what algorithmic techniques am I going to use to analyze that low-density data, or to figure out where the holes are in the data and fill them? Alternatively, let me put some sort of MVP (Minimum Viable Product) out there and then collect more data and create more hooks for that data.

2)     Another very important piece that most companies miss is the interaction data. Think about an autonomous car. When it is first put out on the street, the amount of data you have could be a third of what we have today. Once it is on the street, you can start gathering more data and use it to make the drivability of that car a lot more precise. The same thing applies to your applications. You don't have to wait until you have everything. Remember, you were at zero. So any progress, even if it is 50%, is pretty awesome. If it's 50%, then gather more data and start enriching it.

3)     Most of the time, people think that AI is technology. AI is not technology. AI is math. It's a different way of doing math. I am still in exploration mode with the deep learning side of AI; however, from what I have understood so far, domain expertise, whether that is knowledge of a business process or broader business expertise, plays a vital role. Because if you get that math wrong, you get the business wrong. So the key is that the business has to own it. That's rule number one.

4)     When it comes to collaboration, your technology organization obviously becomes a big pillar, because they are going to provide you access to the data, etc. But then one would also ask: do I need tons and tons of data scientists or AI experts to analyze this data? I would say you don't actually need tons of core data analysts and AI experts. The key thing you want to look for in an employee for producing enriched analytics is scalable skills.

5)     But the key is that once you start solving one problem, think about creating the data pipeline. What I mean by pipeline is this: let's say that you're analyzing an invoice. For example, you're analyzing it to match quantity and price. Well, there are 17 other elements on that invoice: timestamps, the date stamp, location, carrier, the trade lane, etc. Whatever they are, you should capture all that data, and that is your data pipeline. Because the next time you want to analyze, say, a logistical planning problem, you have all these data sets. In addition, it's all about models, so you should maintain a library of all those models. That way you're not reinventing the wheel the next time, and that right there is the scalability you seek to have.

6)     This one is extremely important. If you do not have the ability to put all these algorithms into production, you will never be able to succeed. So whenever you plan to solve these different business problems, it's very important to ask how you are going to put the solution into production, because it may require a different architecture, a different way of planning how you're going to feed the data in and take the insights out, or a very different way of adopting it from a process standpoint.

7)     Last but not least, keep it very simple, understandable and user-friendly. Do not try to capture all the fruit when you already have low-hanging fruit that could be your quick wins. Technology is only an enabler; humans will always be the driving force. A machine can only do what you tell it to do, and it is only you who is in the driving seat.
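
To make points 1 and 5 a bit more concrete, here is a minimal Python sketch of the invoice idea: capture every field on the invoice, not only quantity and price, and then check where the holes in the data are. It is only an illustration under my own assumptions. The field names, the `InvoiceRecord` class, the `capture` and `missing_share` helpers and the sample invoices are all hypothetical and not taken from any real system.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass
class InvoiceRecord:
    # Hypothetical field names, chosen only for illustration.
    invoice_id: str
    quantity: Optional[int] = None
    unit_price: Optional[float] = None
    timestamp: Optional[str] = None
    location: Optional[str] = None
    carrier: Optional[str] = None
    trade_lane: Optional[str] = None


def capture(raw: dict) -> InvoiceRecord:
    """Keep every known field of a raw invoice, not just the ones today's analysis needs."""
    known = {f.name for f in fields(InvoiceRecord)}
    return InvoiceRecord(**{k: v for k, v in raw.items() if k in known})


def missing_share(records: List[InvoiceRecord]) -> Dict[str, float]:
    """For each field, return the share of records where the value is missing (None)."""
    if not records:
        return {}
    total = len(records)
    return {
        f.name: sum(1 for r in records if getattr(r, f.name) is None) / total
        for f in fields(InvoiceRecord)
    }


# Made-up sample invoices: today we only match quantity and price,
# but the other elements are captured as well for later questions.
raw_invoices = [
    {"invoice_id": "A1", "quantity": 10, "unit_price": 2.5,
     "carrier": "X", "trade_lane": "DXB-NBO"},
    {"invoice_id": "A2", "quantity": 4, "unit_price": 7.0,
     "location": "Dubai"},
]

records = [capture(r) for r in raw_invoices]
holes = missing_share(records)

for name, share in holes.items():
    print(f"{name}: {share:.0%} missing")
```

The design choice is simply that the record keeps more than the current question needs, so a later analysis (say, logistical planning) can reuse it, and the missing-share report shows which fields still need more data before they are useful.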

That's about it. If you found this article helpful, please hit the like button or send me a message with your feedback; it will be much appreciated and will keep me motivated to keep sharing more thoughts.

Food for thought: do you need more data scientists and analysts who do all the technical work for you, or people who can interpret the results and narrate the story? I fear we may end up with an unbalanced see-saw, where we churn out a lot of data analysts while individuals who can interpret results are scarce. Something I will think about for my next article. Your views?
