Data Matters

In data science, data means more than one thing. Strong data is a powerful start, a solid path to success, and a basis for meaningful results. It is easy to say this in a sentence, but in reality it is like finding Alice in Wonderland: great data is very rare, and it is varied, not restricted to fixed rules, although there are some predefined metrics for evaluating it. Luckily, neither I nor my teammate has to be the first person to establish the path. There are many pre-built datasets publicly available on the internet, and Kaggle is one of the places to rely on.

However, there are some concerns to address when deciding on a dataset (we believe that finding the right dataset can guide us to the right project definition, one we would enjoy carrying out). It has been disclosed that all master projects must, in one way or another, be tied to local business needs, so this should be the primary filter for finding the right one. The second filter is one we set ourselves: the project should be related to computer vision (there was actually a third one as well, that the project be related to medicine, but we could not find appropriate business needs, which held us back from that idea). We are currently examining the publicly available datasets to select some of them. Once we have the remaining choices, we will group them by reliability, relevance, and impact. Basically, this means that the data must be relevant to Azerbaijan (for example, if we work on Azerbaijani sign language, we cannot choose an Indian sign language dataset), the dataset should contain enough good-quality data, and previous research on that data should report a success rate above 75%.
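As a rough illustration, the three criteria could be applied to a list of candidate datasets with a few lines of Python. This is only a sketch under assumptions: the `Candidate` fields, the `shortlist` function, and the `MIN_SAMPLES` value are hypothetical names and numbers chosen for the example, and the two entries are made-up placeholders, not real datasets. Only the 75% threshold comes from the criteria above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All fields are hypothetical; they mirror the three criteria in the text.
    name: str
    relevant_to_azerbaijan: bool   # relevance: does the data fit the local need?
    num_samples: int               # reliability: is there enough usable data?
    best_reported_success: float   # impact: best success rate in prior work, in percent


MIN_SAMPLES = 1000    # assumption: "enough data" is not quantified in the post
MIN_SUCCESS = 75.0    # from the post: prior research should exceed 75%


def shortlist(candidates, min_samples=MIN_SAMPLES, min_success=MIN_SUCCESS):
    """Keep only the candidates that pass all three filters."""
    return [
        c for c in candidates
        if c.relevant_to_azerbaijan
        and c.num_samples >= min_samples
        and c.best_reported_success > min_success
    ]


# Placeholder entries for illustration only (not real datasets or results).
candidates = [
    Candidate("dataset_a", True, 5000, 82.0),
    Candidate("dataset_b", False, 20000, 95.0),  # fails the relevance filter
]

kept = shortlist(candidates)
print([c.name for c in kept])
```

Putting relevance first matches the idea that local business needs are the primary filter; the other two checks only matter for datasets that pass it.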

Since the start of the summer term, my approach to and thoughts about choosing a master project have clearly changed for the better. However, I believe there is still a contradiction between the ADA requirements and the GWU project suggestions. In class, we spend time learning how to find an interesting project that solves a globally existing problem, and it is better not to repeat existing work (for example, a self-driving car project is not advisable because others have already done a great deal on it). Under the ADA requirements, however, the idea should be bound to a local issue and does not need to be unique (it may already have been done, just not in this particular sphere). This contradiction still holds me back from making solid decisions and from researching the most challenging project, one that would push my boundaries, when there is no guarantee that it will be accepted; the alternative is to wait for a project suggestion from the ADA side.

1 thought on “Data Matters”

  1. pless

    I like this post because it is important to highlight challenges. Our (GWU) course is showing a very large set of possible problems. We know about our problems and world problems, so we can talk about those effectively. Then we hope that you can be inspired by these ways of thinking and find a problem you care about.

    So, good luck, propose some ideas, crazy or not, and ask lots of questions!
