Novia Pratiwi - est.2021
14 min read · Apr 2, 2020

Week 1 — Journey to data (coming from non-technical background)

Introduction to data science

As we’ve gained control over the storage and management of data, the need has grown for more analysts to gather and analyze information. Data analysis and data science are crucial in all industries. We produce more data every year, so the need for data analysts will continue to grow. The shortage is real, and it was predicted many years ago. IDC predicted a need by 2018 for 181,000 people with deep analytical skills, and for data analyst jobs that require basic data management and interpretation, the need is five times greater. As a data analyst in the making, you do the math. Today, demand far outnumbers supply.

People come to analysis and data science from various backgrounds. My first step is career planning for the move into analysis and data science. Let’s start by learning about data analysts: their roles, their skills, and the truth about data. That means seeing the need to understand and identify data, learning to deal with data you don’t have, learning how to work with source data, and understanding the impact of business roles on your data. These gaps appear in most companies because of a lack of the right people and skills. At some point, it also means learning to gather data, compile it, and generate reports that readers with non-technical literacy can follow, using visualization.

Data analysts work with any one of those processes or all of them. Their work is as diverse as the job descriptions on offer. While looking for data jobs, I noticed a pattern: the work often starts with lines of data in databases, spreadsheets or even CSV files. Analysts turn those records into more meaningful results for businesses to interpret, which sometimes means the good fortune of creating powerful visualizations of data.

The occupations that interest me for my career transition include data analyst, marketing analytics, social media analytics, and data visualization developer. These roles deal with data scientists, business intelligence architects, machine learning specialists, and business analytics specialists on a daily basis. Analytics is broad; my desire is to explore business problems and solutions in real time.

For example, a post on Facebook can be accompanied by location information as well as timestamps. With this kind of unstructured but very rich data set, a lot of useful insight can be derived about a customer who is posting and consuming information. IBM has a product called Personality Insights, which offers a profiling service for companies that would like to know more about their customers. In the case of social media analytics, text mining and parsing are very important and necessary first steps. Social media companies often make their content available through their application programming interface, or API. Using this API, data scientists can retrieve the data they want. Collecting the social media data is one thing, but manipulating it for analysis purposes is another. A lot of skill and effort is necessary before attempting to apply analytics methods, although this may vary over time. The table below identifies some of these skills and how I plan to achieve them.

Table of core skills and attributes for data science (Skill / Attribute)

1. Data manipulation techniques (i.e. data literacy, data equity, data culture, data storytelling, visualization, and data management)

My Current Status

I took several courses alongside my bachelor degree in Information Systems, including ‘Introduction to Business Analytics’ as an undergraduate and online learning on R through the Udemy platform.

- Interned to gain basic hands-on experience and practical proficiency with real cases, using various BI/BA tools such as SAS ERP Enterprise Miner, and generating monthly reports using Excel and an in-house CRM system.

- Participating in an online hackathon project in Australia

My Future Plan

- Learn from data scientists and data analysts in my LinkedIn network. Use LinkedIn Learning to gain knowledge on the path to becoming a data analyst and data scientist.

- Review and become familiar with KNIME, master one coding language in the short term (a year), and expand into more programming (e.g. Python and R) over the next 3–5 years.

2. Soft skills (i.e. effective communication, critical thinking analysis, perseverance, and creativity)

My Current Status

- Read news on artificial intelligence, machine learning, and analytics tools.

- Do unpaid projects to work towards an entry-level job as an international student

- Write about my journey and blog regularly here on Medium, which helps to hone my written communication skills.

- Connect with relevant colleagues and postgraduate students (on both vertical and horizontal levels)

- Find and examine relevant material from books, journals and the Internet

My Future Plan

  • Sign up for a mentoring program
  • Open up to connect to more people via LinkedIn and social meetup

3. Statistics/Mathematics model

My Current Status

  • Volunteering to study with and mentor high school students. This will refresh some of the theoretical and mathematical models that I learned in the past.

My Future Plan

Current status and evidence

Data mining is a broad term referring to the practice of examining a large amount of data for the purpose of finding meaningful patterns and establishing significant relationships to help solve a problem.

Data roles typically ask for a relevant tertiary qualification and/or experience working in data analysis, and for candidates who can demonstrate knowledge and experience of databases, data analysis software, automation and programming (e.g. SAS, SQL, Python, XML) to interrogate significant amounts of information, undertake complex analysis and create high-quality reports that deliver strategic insights in a fast-paced environment. The readings, self-assessment exercises and your own topic summaries form the basis of an excellent private study regime and will enable you to acquire a better understanding of the course. Keeping up to date is very important, and because each week builds on the prior weeks, it is important that you get your study regime organised quickly.

- Find some examples and stories in awareness campaigns to teach and inspire others about how to use data well. When the company uses data to make a major change in strategy, present the results to the entire organization so people see examples of how to do this well. For example, Starbucks applies AI to mobile app data to optimize suggestions for new beverages to customers. Netflix uses its vast data sets not only to personalize content on the platform, but also to help create compelling content. PepsiCo uses data visualizations to make high-stakes sales decisions.

- Self-learning and development of self-management skills.

- Relationship management: join and help the community at networking events that run as seminars rather than as a series of lectures. This approach enables experts and beginners in Business Intelligence to work together in a collaborative fashion.

Capable of presenting models in marketing, pre-sales, solution selling, software as a service (SaaS), requirement gathering, pre-sales technical, technology pre-sales, technical specs, presentation skills, and consulting skills

A number of underlying capabilities make data science a career path. These are both technical and non-technical: data infrastructure or data management, and an Emotional Intelligence (EQ) basis such as character, communication, and research skills that leverage visualization technologies. As enabling technologies for digital transformation in business, data infrastructure technologies support how data is shared, processed and interpreted. One of the basic qualifications is a university degree or equivalent, with substantial relevant experience, including work experience within a university research or research-related division. Education and certifications are necessary to demonstrate our passion. The most popular data infrastructure technology that data scientists use today is distributed computing in general, and cloud computing in particular.

Once you know you’re ready to dive into your journey of training and educating yourself in data science, the next big step is to actually commit yourself to lifelong learning. Identify a degree program or online curriculum that will provide a roadmap to your ultimate goal of becoming a data scientist, and simply plunge into it. If you need more guidance, find a mentor who can coach you along the way. This could be your professor, a colleague, or someone you get to know through a LinkedIn invitation. It is never too late, and it is best to start by developing a passion for data science through getting exposed to the field as much as you can.

There are key underlying technologies that enable cloud computing. Virtualization is one of them; distributed file sharing is another. In particular, redundant array of independent disks, or RAID, and Hadoop Distributed File System, or HDFS, are prominent ones. Data management is handled by database management systems, or DBMS. Data science requires highly scalable, reliable, and efficient ways to store, manage, and process data, which is why DBMS plays a critical role in data science. As big data becomes mainstream, unstructured data is also becoming more prevalent. In fact, the majority of business-related data is unstructured. It consists of word processing files, presentations, log files, and so on. However, a significant portion of our data is still stored in conventional relational DBMS and in a structured data format. As a result, the new generation of data science professionals has to be versatile enough to deal with both unstructured and structured data sets. Knowledge of SQL is still invaluable in the context of data management.

Once data analysis is over, the newly acquired insight needs to be conveyed to the leadership and the rest of the organization. No matter how significant the discoveries are, if data scientists fail to communicate them effectively, especially in the context of the strategic goals of the organization, their impact will be minimal. This completely defeats the purpose of the various data science efforts made in support of the organization.
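To make the point about SQL and structured data a little more concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table name `sales`, its columns, and the sample rows are all made up for illustration; a real company database would look different.

```python
import sqlite3


def total_sales_by_region(rows):
    """Load (region, amount) rows into an in-memory table and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing is written to disk
    try:
        # Hypothetical table for illustration only.
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales (region, amount) VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data.
sample_rows = [("North", 120.0), ("South", 80.0), ("North", 30.0)]
print(total_sales_by_region(sample_rows))
```

The `GROUP BY` query is the kind of summary an analyst might hand to a visualization tool or a monthly report.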

What we can do next!

Get certified in the software products you’re required to use on a daily basis. Options include MCSE Business Intelligence, Cloudera Certified Professional, EMC Data Science Associate, Oracle BI Implementation Specialist, and SAS Certified Data Scientist.

Cloudera offers four big data related certifications: Cloudera Certified Professional (CCP) Data Scientist, Cloudera Certified Developer for Apache Hadoop (CCDH), Cloudera Certified Administrator for Apache Hadoop (CCAH), and Cloudera Certified Specialist in Apache HBase (CCSHB).

SAS offers multiple tracks for their certification programs. They are: certified big data professional; advanced analytics certification; and certified data scientist.

INFORMS stands for Institute for Operations Research and the Management Sciences, and offers the Certified Analytics Professional or CAP certification. CAP is unique, in a sense that it is not tied to any specific software product or vendor, unlike many other data science and analytics certifications made available so far.

Salary statistics are hard to pin down because of the field’s diverse and fluctuating nature, so let me give you some anecdotal snapshots of how well specific types of data science and analytics professionals are getting paid. Entry-level data scientists receive above $80,000 per year. Mid-level data scientists make around $120,000, while senior data scientists bring home close to $150,000, according to 2018 figures from Australia’s employment bureau.

Data science still has a broad definition. I believe it will always be a moving target, because data evolves, the tools we use evolve, and the skills required evolve. In my opinion, the shortage we have today exists because we tied the word data to science, and that can immediately make people feel disqualified. Don’t feel that way; just keep learning. It is either for you or not, and you just have to get through the fundamentals. A data analyst is a key member of any data science team, and really, any team. Data analysts sometimes work independently, but you may find yourself on a team that has statisticians, researchers and other analysts. One of my favorite teams to work on has an economist, an epidemiologist, and even a historian. It’s amazing to have an opportunity to work with people with these skills; they need me, and I learn from them. Your role might only require you to bring data to the team, but it could easily expand to doing initial analytics or even visualizations. You may find your team members share some common skills, but they will also have their own unique skills and thoughts to bring to the table, or should I say data. A good team works towards the same goal: analyzing, building measures, and providing information that can improve outcomes.

- Benefits are seen in various industries, such as healthcare. Simulation imitates the operation of a real-world system, and its true power comes from its predictive nature. A computer simulation can rely completely on a mathematical model, which is translated into an algorithm and then finally implemented as a piece of code

- Fraud detection, disease control, climate research, and network security — internetworking

- Education Analytics, Government Analytics, IT Analytics, Marketing Analytics, Insurance Analytics, High Technology Analytics, Amazon Web Services
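The simulation bullet above describes a path from mathematical model to algorithm to code. A minimal sketch of that path follows, assuming a logistic growth model that I picked purely for illustration: the model, the function name `simulate_logistic_growth`, and all parameter values are assumptions rather than anything taken from a real healthcare system.

```python
def simulate_logistic_growth(initial, rate, capacity, steps):
    """Step the model x[t+1] = x[t] + rate * x[t] * (1 - x[t] / capacity).

    Mathematical model: growth slows down as x approaches the capacity.
    Algorithm: apply the update rule once per time step.
    """
    values = [float(initial)]
    for _ in range(steps):
        current = values[-1]
        values.append(current + rate * current * (1 - current / capacity))
    return values


# Illustrative parameters only: start at 10, grow 50% per step, cap at 100.
trajectory = simulate_logistic_growth(initial=10, rate=0.5, capacity=100, steps=20)
print([round(value, 1) for value in trajectory[:4]])
```

Running the model forward like this is what gives a simulation its predictive use: you can change the rate or capacity and see how the projected outcome shifts.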

Credible Plan Ahead

Essential skills in the workplace

It helps to start with some obvious ones such as data mining, machine learning, natural language processing, statistics, and visualization.

  1. Data manipulation techniques. Text retrieval is one of the most well-known data mining techniques. It builds on many foundational concepts and methods developed by Natural Language Processing, or NLP. […] worthy users of an online banking system. Prediction builds a model that produces continuous or ordered values that form a trend. For instance, a prediction model can provide estimated mean time to failure, or MTTF, values for a computer. Clustering is a process of grouping similar data objects into a class. Clustering helps reveal features that distinguish one class of data objects from another, leading to new discoveries about a dataset. Uses of clustering analysis range from pattern recognition and image processing to market research. For example, clustering can reveal people with similar purchasing behaviors. As you might have noticed already, the difference between classification and clustering is that classification starts with predefined labels, while for clustering the labels are created after the fact.
  2. Statistics and mathematics. At a minimum, a data scientist needs to be proficient with concepts such as probability, correlation, variables, distributions, regression, null hypothesis significance tests, confidence intervals, t-tests, ANOVA and chi-square. You also need to know how to use common statistical analysis tools, including R, Excel and SAS. At a more advanced level, a data scientist needs to be familiar with concepts and algorithms like logistic regression, support vector machines (SVMs), and Bayesian methods.
  3. Effective communication, such as storytelling and report generation. Guidelines for good visualization are readily available. These include displaying data at multiple levels of detail and avoiding distortion of the message while visualizing it. It is also very helpful to know how to use some of the software tools offered by the industry leaders in visualization solutions. For example, Tableau offers one of the most popular and comprehensive visualization tools for data scientists, and Amazon has a service called QuickSight.
  4. Storytelling and creative visualisation
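To illustrate the difference between classification and clustering mentioned in the list above, here is a small sketch in plain Python. It assumes one-dimensional data (monthly spend per customer, with made-up numbers) and uses a very simple two-group k-means for the clustering part; the labels “low spender” and “high spender” are hypothetical.

```python
from statistics import mean


def classify(value, labelled_centroids):
    """Classification: assign a value to the closest of the predefined labels."""
    return min(labelled_centroids, key=lambda label: abs(value - labelled_centroids[label]))


def cluster_two_groups(values, iterations=10):
    """Clustering: 1-D k-means with k=2. The groups are discovered, not given."""
    low, high = min(values), max(values)  # initial centroids
    group_low, group_high = list(values), []
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_high:  # all values are identical, nothing to split
            break
        low, high = mean(group_low), mean(group_high)
    return sorted(group_low), sorted(group_high)


# Made-up monthly spend figures.
monthly_spend = [12, 15, 14, 95, 102, 99]

# Clustering finds two groups first; an analyst names them afterwards.
print(cluster_two_groups(monthly_spend))

# Classification starts from labels that already exist.
known_labels = {"low spender": 15, "high spender": 100}
print(classify(20, known_labels))
```

In practice a library such as scikit-learn would handle this, but the sketch shows where the labels come from in each case.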

Current jobs and unpredictable future jobs

[…] an all-encompassing job title. Data is everywhere, and its volume is ever increasing. Every organization can benefit from hiring a person who can provide data analysis and analytics to reflect on its past performance and to attempt to predict its future.

There are job titles such as data scientist, data engineer, business intelligence architect, machine learning specialist, data analytics specialist, and data visualization developer.

Take network security. Let’s assume that you need to analyze a terabyte of data every day. The goal here is detecting suspicious behavior. There are numerous roles involved in this including domain experts, such as cyber security professionals, database administrators, cloud and distributed computing specialists, network engineers, software engineers, and last but not least, data scientists.

The entry barrier for a data science job is relatively lower. Solid training in computer science or statistics may be enough for you to get started in an entry-level position, but a Master’s degree in data science is a big plus. In this case, what’s more important is your passion for data. Also, the potential for job growth is very high as you become a seasoned data scientist and take on various leadership positions in a company.

Responsibilities and Salary Compensation

1. Creativity and independence

At its highest level, the job requires you to be highly creative and independent. Nobody can tell you what to do when your job is to enhance customer interactions at a multi-billion dollar company through machine learning techniques.

2. Discipline

You also need the discipline to follow through and meet deadlines.

3. Attention to detail and quality

A small, seemingly insignificant mistake can wreak havoc on your entire project when you have to deal with millions of unhappy customers.

Technical skills:

  1. Math skills are essential because they form the foundations of the technical language machine learning scientists use. In particular, deep knowledge of statistics and probability is important.
  2. Ability to develop and validate a mathematical model representing various aspects of machine learning. Once a model is developed, it needs to be translated into an algorithm or unambiguous and discrete processes the computer can execute.
  3. Practical IT skills. Proficiency in programming languages such as Python, C++, Java and R is very helpful. Your work efficiency as a machine learning scientist often depends on your ability to preprocess a large amount of text very quickly and efficiently. Therefore, familiarity with Unix/Linux tools like sed, awk, grep, find, and sort is highly useful.
  4. Understanding of distributed computing. Your machine learning program will most probably have to take advantage of technologies such as Hadoop and cloud computing. Because we can store data more cheaply and easily, there is an increasing number of data sources available to us. These include images, videos, maps, networking data, social media data and so on. Naturally, there is a growing need for data processing, and machine learning scientists are at the forefront of these efforts to leverage the data around us.
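As a rough illustration of the text preprocessing mentioned in the list above, the sketch below does in Python what a pipeline such as `grep ... | sort | uniq -c | sort -rn` does on the command line. The log lines and the function name `count_matching_lines` are made up for the example.

```python
import re
from collections import Counter


def count_matching_lines(lines, pattern):
    """Keep lines matching a regular expression and count how often each one occurs."""
    regex = re.compile(pattern)
    matches = [line.strip() for line in lines if regex.search(line)]
    # most_common() returns (line, count) pairs, highest count first.
    return Counter(matches).most_common()


# Made-up log lines for illustration.
log_lines = [
    "ERROR disk full",
    "INFO service started",
    "ERROR disk full",
    "ERROR timeout",
]
for line, count in count_matching_lines(log_lines, r"^ERROR"):
    print(count, line)
```

For very large files the command-line tools are often faster to reach for, which is part of why they are worth knowing alongside a programming language.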

Interview questions

  1. Which data analysis software are you well-versed in?

Be prepared to be both business and technology savvy! One of the core responsibilities of a data analyst or BI architect is to design and implement system architectures that maximize the potential of a company’s data assets. To make this happen, the BI architect needs to be able to build a system that links various standalone IT systems throughout a company to pool relevant and useful information for strategic decision-making.

2. What are some of the data projects that you have worked on before?

Highlight the skills and experiences that are listed in the job advertisement.

3. Why did you go into data analysis?

  4. Why do you want to work for us? <Specific to the job advertisement and IT industry>
  5. How do you keep your technology skills current? What online resources do you use to help you do your job?
    This tech interview question can help you gauge the candidate’s enthusiasm for the profession, as well as open up a conversation about professional development…
  6. What is your process when you start a new project?
  7. Can you tell me about a time when things didn’t go the way you wanted at work, such as a project that failed or being passed over for a promotion? Time pressure, working in a project, communicating with a team, and
  8. What would you hope to achieve in the first six months after being hired? This is likely to be asked for intern or entry-level jobs.
  9. What kind of work environment suits you best? Are you comfortable working remotely or on a flexible schedule?
  10. What are your strengths? These could be both technical and soft skills.


