
Georgia State University Text Mining Web Scraping & MySQL Queries Lab Report

Question Description

Text mining has recently attracted considerable attention in both academia and industry, especially in FinTech. Many algorithms have been developed either to accurately predict company returns or to construct reasonable investment indices. Conference earnings call scripts provide a large amount of textual information for decision makers. Each script consists of a presentation part and a question-and-answer (Q&A) part. The presentation is usually given by the company's executive staff (typically the CEO); in the Q&A part, participants such as investors ask questions and the company's representatives answer them.

The task is to build a relational database, named CECS, for conference earnings call scripts.

1. First, design an ERD using normalization techniques, so that each entity in your database is in at least third normal form (3NF). Your database should contain at least the following attributes:

    1. Ticker: the ticker name of the company
    2. Company: the full name of the company
    3. Title: the title of the conference call script
    4. Date: the date on which the conference call takes place
    5. Time: the specific time at which the conference call starts
    6. Section: Presentation or Q&A
    7. Speech: textual information (e.g., content)
    8. Participant name: the name of the participant
    9. Participant type: CEO, Company Staff, Analyst, or Others
    10. Participant organization: the name of the participant's organization
2. Decompose the above attributes (you are not limited to these 10) into separate entities, each containing a reasonable set of attributes.
3. After your database schema is completed, you should be able to scrape data from https://seekingalpha.com/earnings/earnings-call-transcripts for the years 2019 and 2020.
4. Once your CECS database is completed, you should write SQL queries to answer questions similar to the following:
    1. How many conference calls took place in Q1 2020?
    2. Given a ticker name, e.g., FUV, how many conference calls were held in 2020?
    3. Given a ticker name and a date, e.g., FUV, how many participants were in the conference call, and who were they? Further, given the name of a participant, display their speech.
    4. Answer other pertinent questions using SQL queries.
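
One possible way to approach the schema and the sample questions is sketched below. It is only an illustration under stated assumptions: the five-table decomposition, every table and column name, and the helper functions are hypothetical, the sample rows are placeholders rather than real transcript data, and Python's built-in `sqlite3` module stands in for MySQL so the example runs without a server (column types and the connection code would differ in MySQL). It also assumes that a participant counts as attending a call only if they have at least one speech in it.

```python
import sqlite3

# Hypothetical 3NF decomposition; every table and column name is an assumption.
SCHEMA = """
CREATE TABLE company (
    ticker TEXT PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE organization (
    org_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL UNIQUE
);
CREATE TABLE participant (
    participant_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    type   TEXT NOT NULL
           CHECK (type IN ('CEO', 'Company Staff', 'Analyst', 'Others')),
    org_id INTEGER REFERENCES organization(org_id)
);
CREATE TABLE conference_call (
    call_id   INTEGER PRIMARY KEY,
    ticker    TEXT NOT NULL REFERENCES company(ticker),
    title     TEXT NOT NULL,
    call_date TEXT NOT NULL,  -- ISO 8601 date, e.g. 2020-03-25
    call_time TEXT            -- start time, e.g. 17:00
);
CREATE TABLE speech (
    speech_id      INTEGER PRIMARY KEY,
    call_id        INTEGER NOT NULL REFERENCES conference_call(call_id),
    participant_id INTEGER NOT NULL REFERENCES participant(participant_id),
    section        TEXT NOT NULL CHECK (section IN ('Presentation', 'Q&A')),
    seq            INTEGER NOT NULL,  -- speaking order within the call
    content        TEXT NOT NULL
);
"""

# Shared join used by the participant and speech queries.
JOINS = """
    FROM speech s
    JOIN conference_call c ON c.call_id = s.call_id
    JOIN participant p ON p.participant_id = s.participant_id
    WHERE c.ticker = ? AND c.call_date = ?
"""


def build_demo_db():
    """Create an in-memory database with placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO company VALUES ('FUV', 'Placeholder Company')")
    conn.executemany("INSERT INTO organization VALUES (?, ?)",
                     [(1, "Placeholder Company"), (2, "Placeholder Bank")])
    conn.executemany("INSERT INTO participant VALUES (?, ?, ?, ?)",
                     [(1, "Alice Example", "CEO", 1),
                      (2, "Bob Example", "Analyst", 2)])
    conn.executemany("INSERT INTO conference_call VALUES (?, ?, ?, ?, ?)",
                     [(1, "FUV", "Placeholder call A", "2020-03-25", "17:00"),
                      (2, "FUV", "Placeholder call B", "2020-05-20", "17:00"),
                      (3, "FUV", "Placeholder call C", "2019-11-14", "17:00")])
    conn.executemany("INSERT INTO speech VALUES (?, ?, ?, ?, ?, ?)",
                     [(1, 1, 1, "Presentation", 1, "Welcome everyone."),
                      (2, 1, 2, "Q&A", 2, "What is the outlook?"),
                      (3, 1, 1, "Q&A", 3, "We remain cautious.")])
    return conn


def calls_in_range(conn, start, end, ticker=None):
    """Count calls between two ISO dates, optionally for a single ticker."""
    sql = "SELECT COUNT(*) FROM conference_call WHERE call_date BETWEEN ? AND ?"
    params = [start, end]
    if ticker is not None:
        sql += " AND ticker = ?"
        params.append(ticker)
    return conn.execute(sql, params).fetchone()[0]


def participants_in_call(conn, ticker, call_date):
    """List (name, type) for everyone who spoke in the given call."""
    sql = "SELECT DISTINCT p.name, p.type" + JOINS + " ORDER BY p.name"
    return conn.execute(sql, (ticker, call_date)).fetchall()


def speeches_by(conn, ticker, call_date, name):
    """Return one participant's speeches in a call, in speaking order."""
    sql = "SELECT s.content" + JOINS + " AND p.name = ? ORDER BY s.seq"
    return [row[0] for row in conn.execute(sql, (ticker, call_date, name))]


conn = build_demo_db()
print(calls_in_range(conn, "2020-01-01", "2020-03-31"))          # Q1 2020
print(calls_in_range(conn, "2020-01-01", "2020-12-31", "FUV"))   # FUV in 2020
print(participants_in_call(conn, "FUV", "2020-03-25"))
print(speeches_by(conn, "FUV", "2020-03-25", "Alice Example"))
```

Keeping the speech text in its own table, keyed by call and participant, means the company, call, and participant details are each stored once, which is the point of the 3NF requirement. The number of participants in a call is simply the length of the list returned by `participants_in_call`.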

**I need all of the above, including the data schema, the ERD, a demo of the queries, the web scraping code, and the database construction code.**
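
For the web scraping code, the step that turns a downloaded transcript into rows of section, speaker, and speech might look like the sketch below. It deliberately skips the download itself and makes strong assumptions about the input: the transcript has already been reduced to plain-text lines, each speaker's name appears alone on a line before their remarks, the participant names are known in advance, and the Q&A part starts with a line reading exactly `Question-and-Answer Session`. The real page layout is not described here, so `QA_MARKER`, `Turn`, and `parse_transcript` are hypothetical names and would need adapting to the actual markup.

```python
from dataclasses import dataclass

# Assumed header line that separates the presentation from the Q&A part.
QA_MARKER = "Question-and-Answer Session"


@dataclass
class Turn:
    seq: int       # speaking order within the call
    section: str   # "Presentation" or "Q&A"
    speaker: str
    text: str


def parse_transcript(lines, participants):
    """Group plain-text transcript lines into speaker turns.

    `participants` is the set of known speaker names; a line equal to one
    of them starts a new turn. Lines before the first speaker are ignored.
    """
    turns = []
    section, speaker, buffer = "Presentation", None, []

    def flush():
        # Store the buffered lines as one turn for the current speaker.
        if speaker is not None and buffer:
            turns.append(Turn(len(turns) + 1, section, speaker, " ".join(buffer)))
        buffer.clear()

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line == QA_MARKER:
            flush()
            section, speaker = "Q&A", None
        elif line in participants:
            flush()
            speaker = line
        elif speaker is not None:
            buffer.append(line)
    flush()
    return turns


# Placeholder input, not real transcript text.
sample = [
    "Alice Example",
    "Welcome everyone.",
    "Thanks for joining.",
    "Question-and-Answer Session",
    "Bob Example",
    "What is the outlook?",
    "Alice Example",
    "We remain cautious.",
]
for turn in parse_transcript(sample, {"Alice Example", "Bob Example"}):
    print(turn)
```

Each `Turn` carries the section, speaker, speaking order, and text of one speech, so it can be loaded into whatever table the schema uses for speeches once the speaker name has been resolved to a participant record.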
