To be fair, I'm not entirely sure whether we're supposed to write a blog this week, considering that we mainly had briefings and there isn't much to conclude from them. However, I can use this uncertainty and combine it with the need to summarize the project's requirements. I'll note the most important parts that we need to focus on at the moment.

As Dr. Pless has mentioned, many teams do not have a definite answer on what kind of data they will use for the project. The same applies to our case. Our project is related to the petroleum industry; however, we are not very familiar with the field, as we are computer science students. Therefore, we need to do some research to understand what data is needed and in what shape.

This raises another concern, or risk, with this data: how to obtain it. Coincidentally, our team misunderstood the meaning of "risks" in the Heilmeier Catechism, as we thought it referred to the risks of the end product. It turns out that by risks we were supposed to mention the possible issues we may face throughout the project period.

As we have already mentioned, we have spoken to our colleagues at the workplace. They cannot provide us with the data they have, but they assure us that the data provided by the NPD is almost identical. Of course, this may lead to another risk in the future, namely integrating two different data sources, but if we take our colleagues at their word, it should not be too difficult.

Another point Dr. Pless raised (besides how long some of the presentations were) is that many students do not follow the Heilmeier Catechism structure in their briefings. I am not completely sure whether this also applies to us, but we should work on the presentation's structure and improve it accordingly.

Now the main task on our hands is to research what kind of data this particular topic requires and what shape it should take. Then we need to make sure we can get this data so that we can proceed with the machine learning part of the project.

This time the blog consists of two subparts, requested by Dr. Pless and Dr. Kaisler respectively. Dr. Pless requested examples of the data we will or could use for our research, while Dr. Kaisler asked us to show a few examples of what a computer cannot do. Hence the title.

The data:

I might have spoiled this part a little in my previous blog, where I mentioned that I, together with my project mate Farida, will be focusing on the data provided by the Norwegian Petroleum Directorate (NPD). There are a few reasons for using this specific source. The first is that it is an open data source for the petroleum industry that can be used for our project. In this link, one can find the information needed for research in this specific field. It comes in various formats, including standard CSV. Within the provided data, the NPD also includes useful extra information about the geographical positions of drilled wells. As shown in the screenshot on the right, there is a map showing activities related to the petroleum industry. In this link, one can see more detailed data in written form.

Another reason why this data will be helpful for us is that the company we worked at uses a similar structure in its databases, which should eventually make it easier to integrate the systems if the project ends successfully.

Yet another advantage of using this data is that it avoids the need for an NDA and gives us a less restricted working environment.
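As a rough illustration of what working with such CSV exports could look like, here is a minimal Python sketch. The column names (`well_name`, `latitude`, `longitude`) and the sample rows are made up for the example; they are assumptions, not the actual NPD schema or real well positions.

```python
import csv
import io

# Hypothetical sample rows. The column names and values are assumptions
# for illustration only, not the actual NPD schema or real data.
SAMPLE_CSV = """well_name,latitude,longitude
Well-A,60.5,2.8
Well-B,61.2,3.1
"""


def load_wells(csv_text):
    """Parse CSV text into a list of dicts with numeric coordinates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wells = []
    for row in reader:
        wells.append({
            "well_name": row["well_name"],
            "latitude": float(row["latitude"]),
            "longitude": float(row["longitude"]),
        })
    return wells


wells = load_wells(SAMPLE_CSV)
for well in wells:
    print(well["well_name"], well["latitude"], well["longitude"])
```

With a real export, the text would come from a downloaded file instead of the inline string, and the column names would have to be adjusted to whatever the file actually contains.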

The flaws of computers:

I think people keep having contradictory opinions on what computers are capable of. While some would argue that they are better than humans, others would say they are not even close. In all honesty, I think both views are right. Computers can do math very quickly and (usually) without mistakes. However, what they can't do is "understand". When they solve a problem, they don't know why they are doing it; they are just following an algorithm. They don't understand human things like feelings either. Or sarcasm. I do not doubt that one day they will be able to; in my opinion it will require a tremendous amount of data with similar examples, but it will happen. Technology never stops advancing. This may cause some disagreement about how safe it is for a computer to be humanlike, but until then we cannot say for certain.

So, the first week has concluded, and I should summarize my newly gained knowledge and point of view. I won't hide the topics of interest I wrote down in class, as they are not a big secret. On the contrary, I would like to explain why I chose those topics.

I will start with the one I have personally been interested in for a long time: "AI in Health Industry using globally unified data from every hospital". I won't dive into the story of how I wanted to be a doctor when I was a kid, until the fear of being responsible for human lives stopped me. But the idea of unifying the experiences and results of all hospitals, maybe not in the whole world at first but in one specific country, sounds like a great opportunity to make health decisions easier and better informed. With AI in the picture as well, it would perhaps be even quicker to analyze one's issues and compare them to other cases. In the long run, unifying all hospitals would help less developed countries with their healthcare systems and eventually lower the number of deaths caused by incorrect or inexperienced decisions by doctors. However, looking realistically at how the world works, I understand that this "idea" will probably never see the light of day. Simply put, it is impossible to make every country cooperate when it is nearly impossible to make even one country act responsibly.

Now, let's get back to real problems that can be solved and my second topic of choice: "Application of machine learning in the petroleum industry". As I have mentioned previously, the company I worked at before coming to the US (and where I will probably work again after going back to Azerbaijan) is connected to the state oil industry. Applying ML in this industry is not a new topic, but considering how many different areas ML could be applied to, there is always a need for new implementations. While I'm still more familiar with computer science than with the oil industry, it will be my responsibility to analyze the relevant fields thoroughly and find the particular area where ML is most needed at the moment. There are a few restrictions I have right now (or at least those I have identified so far). For example, time is the main limiting factor at the moment. I need to produce some kind of outcome at 3 stages within the next 10 months, including the next spring semester. The first is in mid-August, which is tied to this class's requirements. The next deliverable should be in December, and lastly a fully working prototype by May 2022. Another limitation is having enough data to apply ML to. According to my company's policies, it is prohibited to use their data outside the company, but there is also open data provided by the Norwegian Petroleum Directorate, which could be used during the research. Of course, this project can by no means be considered a small one, so I must use all the project planning knowledge I acquired during my bachelor's years and plan every step accordingly. Time, resources, risks, and deliverables should all be well planned and documented.

I'm Mustafa Aslanov, born in Baku, Azerbaijan. I am currently studying in a dual-degree master's program in Computer Science and Data Analytics at ADA and GWU. Before coming to the US, I was employed as an Application Consultant at a company named Caspian Innovation Center, a joint venture of SOCAR (State Oil Company of Azerbaijan Republic) and IBM (International Business Machines Corporation).

I studied for my bachelor's degree at ADA as well, which is how I learned about this new degree. That was very handy at the time, considering that closer to graduation I had developed an interest in the data analytics field. As part of my research thesis (or, as we call it, the Senior Design Project), I analyzed Azerbaijani text data together with my university mate and created an n-gram model for it. Essentially, it was a Natural Language Processing project. The paper was later published on IEEE Xplore after my mate and I participated in the AICT2020 Conference. While I cannot say that this paper is very impressive, I feel a little bit proud, as it is my first published paper!

Even though I mentioned that I am no longer employed by that company, I will likely return once I am back in Azerbaijan. Naturally, I have developed an interest in the petroleum industry and want to try applying machine learning in that field as part of my future master's thesis. This will be done with my program mate, Farida, who was also my co-worker at the same company.

Outside of education and work, I'm quite an introverted person and tend to be quiet, which is a bad trait, as I barely share my opinions when I need to. But I'm hoping to fix that soon! According to Spotify's analysis, my favorite music genre is alternative rock. However, if you ask a few people who have heard the songs I listen to, they will call them "depressing" songs. They are just calm! Among my favorite artists are Linkin Park, Daughter (Ex:Re), Nothing But Thieves, Twenty One Pilots, and Ben Howard. The list could go on, but I decided to name only the top 5 🙂
