St. Petersburg University
Graduate School of Management
Master in Management Program
IBM Watson Analytics vs. Conventional Econometrical
Software: A Comparative Analysis of Suitability for
Financial Sector
Master’s Thesis by the 2nd year student
Concentration — Information
Technologies and Innovative Management
Ilias Faizullov
Research advisor:
Associate professor, Sergey A. Yablonsky
St. Petersburg
2016
STATEMENT ON THE INDEPENDENT COMPLETION OF THE
GRADUATION QUALIFICATION WORK
I, Ilias Rafisovich Faizullov, second-year student of the Master program in «Management», declare that my master's thesis on the topic «IBM Watson Analytics and Standard Econometric Packages: A Comparative Analysis of Suitability for the Financial Sector», submitted to the Master Office for subsequent transfer to the State Attestation Commission for public defense, contains no elements of plagiarism.
All direct borrowings from printed and electronic sources, as well as from previously defended graduation qualification works and from candidate and doctoral dissertations, have appropriate references.
I am aware of the content of clause 9.7.1 of the Rules of study in the major educational programs of higher and secondary professional education at St. Petersburg University, which states that «the graduation qualification work is completed individually by each student under the supervision of an assigned research advisor», and of clause 51 of the Charter of the Federal State Budgetary Educational Institution of Higher Education «Saint Petersburg State University», which states that «a student is subject to expulsion from St. Petersburg University for submitting a course or graduation qualification work completed by another person (or persons)».
_______________________________________________ (Student's signature)
25.05.2016 (Date)
STATEMENT ABOUT THE INDEPENDENT CHARACTER OF
THE MASTER THESIS
I, Faizullov Ilias, second year master student, program «Management», state that my master
thesis on the topic «IBM Watson Analytics vs. Conventional Econometrical Software: A
Comparative Analysis of Suitability for Financial Sector», which is presented to the Master
Office to be submitted to the Official Defense Committee for the public defense, does not
contain any elements of plagiarism.
All direct borrowings from printed and electronic sources, as well as from master theses,
PhD and doctorate theses which were defended earlier, have appropriate references.
I am aware that according to paragraph 9.7.1 of the Guidelines for instruction in major
curriculum programs of higher and secondary professional education at St. Petersburg University,
«A master thesis must be completed by each of the degree candidates individually under the
supervision of his or her advisor», and according to paragraph 51 of the Charter of the Federal State
Institution of Higher Education Saint Petersburg State University, «a student can be expelled
from St. Petersburg University for submitting a course or graduation qualification work
developed by another person (or persons)».
________________________________________________(Student’s signature)
25.05.2016 (Date)
Table of contents
Introduction..................................................................................................................................... 4
Chapter I. The state of the art in predictive analytics................................................................... 6
1.1 Predictive analytics and big data.............................................................................................. 6
1.2 Predictive analytics..................................................................................................................10
1.3 Social Media and Business news Analytics.............................................................................12
1.4 Market of predictive analytics tools in financial sphere..........................................................13
1.5 Research gap............................................................................................................................14
1.6 Research methodology and organization of the study............................................................. 15
1.7 Conclusion of Chapter I...........................................................................................................16
Chapter II. Research framework....................................................................................................18
2.1 Research goals, KPIs, objectives, questions, and limitations.................................................. 18
2.2 Methods of evaluation of advanced analytical platforms........................................................ 20
2.3 Methods of comparing the forecasting accuracy of IBM Watson and statistical packages.....21
2.4 Method of currency exchange rate forecasting using Statistical Packages.............................22
2.5 Methods of stock forecasting using Statistical packages..........................................................23
2.6 Conclusion of Chapter 2........................................................................................................... 25
Chapter 3. Empirical estimation of analytical platforms............................................................... 26
3.1 Evaluation of the Analytical Platforms....................................................................................26
3.1.1 Justification of the choice of analytical platforms taken for consideration..........................26
3.1.2 Results of Evaluation of Analytical Platforms......................................................................28
3.2 Evaluation of the forecasting accuracy of IBM Watson Analytics.......................................... 30
3.2.1 Data description.................................................................................................................... 30
3.2.2 Forecasting stock prices with theoretically based models....................................................32
3.2.2.1 Results of the Random walk models for currencies........................................................... 32
3.2.2.2 Currency’s exchange rates forecasting using factor models........................................... 33
3.2.2.3 Stock forecasting using CAPM model.............................................................................. 34
3.2.3 Forecasting stock market using IBM Watson analytics........................................................35
3.2.3.1 Models for stock forecasting............................................................................................. 35
3.2.3.2 Models for currency’s exchange rate forecasting........................................................... 39
3.2.3.3 Analysis of the results of stock price forecasting.............................................................. 42
3.3 Conclusion of the Chapter 3.................................................................................................... 43
Final Conclusions.......................................................................................................................... 44
Discussion of the findings............................................................................................................. 44
Theoretical implications................................................................................................................ 45
Managerial implications................................................................................................................ 46
Limitations.....................................................................................................................................46
List of references........................................................................................................................... 47
Appendix 1. Specifications of Models.......................................................................................... 50
Appendix 2. Specification of models, suggested by Watson Analytics.........................................53
Appendix 3. Results of the IBM Watson Analytics Predict function for currencies.....................65
Appendix 4. Results of the IBM Watson Analytics Predict function for stocks............................68
Introduction
The market for financial analytics is a fast-growing niche. The Financial Analytics
Market forecast conducted by Research and Markets (2014) estimated that by 2018 the total
market value of financial analytics would reach 6.65 billion dollars. At the moment, many
different players are struggling to capture a share of this market, among them such renowned
giants as IBM and Microsoft.
As Srivastava (2015) states, this rapid growth of the financial analytics market is driven
by the need of financial organizations to manage increasing amounts of structured and
unstructured information coming from different sources. In other words, the emergence of big
data creates a market for advanced analytics.
One sphere of financial analysis that attracts the attention of both financial organizations
and individual traders is stock price forecasting. The main characteristic of any financial asset,
available to all market participants, is its price. Prices can take the form of purchase prices of
bonds and stocks, currency exchange rates, or interest rates on bank deposits. The whole set of
these values at any given moment in time comprises the conjuncture of the market. There are
three classic methods of predicting stock price dynamics: Technical Analysis, Fundamental
Analysis, and Quantitative Analysis.
According to Schwager (1996), technical analysis is based on the examination of
historical trends in the market, represented by market statistics of stock prices and volumes.
Technical analysis operates under the assumption that all available and relevant information,
including so-called fundamental factors, is reflected in the asset's price. In addition, a technical
analyst assumes that some patterns of the stock market are repetitive and can be revealed using
indicators, oscillators, and other "technical" methods. The shortcoming of this approach is the
absence of a systematic, scientific basis for the majority of its empirical methods.
Another approach is fundamental analysis, which is based on the evaluation of
fundamental macroeconomic and microeconomic factors. Niemira (1998) claims that
fundamental analysis focuses on the condition of the issuer: its revenues, market position, and so
on. Macroeconomic factors influencing the whole industry and country (GDP, unemployment
rates, etc.) are also taken into consideration.
The third classic approach to stock market analysis, as described by Cuthbertson (1996),
is quantitative analysis. Like technical analysis, it is based on statistical data, but instead of
indicators it uses statistical and mathematical models and tools, which are also referred to as
econometric.
In recent years a new approach has emerged: predictive analytics. It has gained attention
due to the increasing amount of available, market-relevant information. Mark E. (2006)
estimated that in 2007-2009 humanity would generate more information than in the previous
1,000 years. This information overload gave rise to the term "Big Data", which refers to
high-volume, high-velocity, and high-variety data. Predictive analytics is quantitative analysis
per se, but with the ability to operate on big data. It uses the same statistical and mathematical
tools as quantitative analysis; however, it differs in the research approach: while standard
econometric models merely test pre-generated, theory-based hypotheses, predictive analytics is
capable of finding correlations between variables in huge datasets without a preliminary
hypothesis, i.e., it generates its own statistical hypotheses from the data.
Big Data creates challenges as well as opportunities, and financial organizations such as
banks have a lot to gain from analyzing it. As Tian (2015, 34) argues: “The large scale of data
contain enormously valuable information, and analytics based on big data can provide financial
organizations with more business opportunities and the possibility to gain a more holistic view of
both market and customers. Big data analytics can benefit banking and financial market firms in
many aspects, such as accurate customer analytics, risk analysis and fraud detection. These
approaches can lead to smarter and more intelligent trading, which can help organizations to
avoid latent risks and provide more personalized services, thus to get a higher degree of
competition advantage”. The challenge of analyzing vast amounts of high-volume, high-velocity,
and high-variety data, presented in both unstructured and structured forms, creates the need for
an advanced analytical tool.
Nowadays, there are multiple analytical platforms available to banking and other
financial organizations. Giants such as IBM, Microsoft, Google, and Amazon offer their
analytical products to the market. According to Gartner's Magic Quadrant for Advanced
Analytics Platforms (2014), the leading positions on the market belong to IBM, RapidMiner,
and SAS. Microsoft is lagging behind, but over the past two years it has shown positive
momentum and is now catching up with the leaders.
The goal of this research is to determine which of these analytical platforms is the best
fit for the purposes of stock market forecasting. In the theoretical part, we will discuss the
influence of big data and predictive analytics on financial organizations' operations. Then we
will define the requirements these organizations place on an analytical platform and generate a
set of KPIs to evaluate the platforms.
Among other KPIs, we will pay attention to the ability of analytical platforms (using
IBM Watson Analytics as an example) to generate predictive models for stock price forecasting.
We will compare the results with the outcomes of some traditional, theoretically based
econometric models.
Chapter I. The state of the art in predictive analytics.
1.1 Predictive analytics and big data.
Predictive analytics is connected with the term "Big Data", which has become popular in
the past decade, as shown by Jianzheng (2016); Figure 1 illustrates the rising academic interest
in the subject.
Figure 1. Dynamics of the number of published studies on Big Data. Source: Jianzheng
(2016).
There is confusion among executives around the world regarding what Big Data really
is. As shown in Figure 2, according to research conducted by SAP (2012), the majority of
executives perceive big data as an increased amount of customer-related information that
requires processing (28% of respondents), and almost a quarter associate Big Data with the
technologies for processing vast amounts of information.
TechAmerica Foundation defines big data as follows: “Big data is a term that describes
large volumes of high velocity, complex and variable data that require advanced techniques and
technologies to enable the capture, storage, distribution, management, and analysis of the
information.”
Another definition of big data can be found in the Gartner IT Glossary: “Big data is
high-volume, high-velocity and high-variety information assets that demand cost-effective,
innovative forms of information processing for enhanced insight and decision making.”
Figure 2. Definitions of big data based on an online survey of 154 executives in April
2012. Source: SAP (2012)
Both definitions describe Big Data as data possessing three qualities, also called the
three V's: Volume, Variety, and Velocity.
Volume is a relative characteristic of Big Data, as it tends to increase over time: what is
considered a huge volume today may not meet the requirements of being "Big" in the future.
For example, in 2012 a dataset over a terabyte was considered Big Data, says Schroeck, M.
(2012).
Variety means structural heterogeneity of Big Data, which consists of many data
formats. As Cukier, K. (2010) claims, only around 5% of data is structured; the other 95% is
unstructured, represented mostly by audio, video, and text formats. Unstructured data cannot be
analyzed by machines directly; therefore, it poses a serious challenge for an analyst.
Velocity refers to the speed at which data is generated. The rise of digital technologies
has increased the rate of information generation, making analysis of the market even more
complicated.
There are additional V's of Big Data, introduced by IBM, SAS, and Oracle: Veracity,
Variability and Complexity, and Value.
Veracity refers to the unreliability of the data; for example, social media sources are
unreliable by nature, as they are generated by the broad masses of people.
Variability and Complexity refer to the unsteady rate of information generation and the
diversity of the sources it comes from. Analyzing multiple information flows, which come at
different rates and have their own cycles, troughs, and peaks, drives the need for advanced
analytics tools.
Finally, the last V is Value. Big data is characterized by a low share of valuable
information; nevertheless, the overall value of the whole dataset is high, as the volume is
immense, which also supports the need for an appropriate analytical tool.
All these V's are not constant: they vary over time and across industries, and they are
also interdependent; if one changes, the others are influenced as well.
It would be a mistake to pay attention only to the first V, volume; the other V's are no
less important. As Jagadish (2015, 50) claims, the main reason the volume of data gets more
attention is that it is easily measurable, unlike variety and veracity: “I have discussed above, why Volume
(or size) gets undue attention. Let me turn now to why I think Variety and Veracity do not get the
attention they deserve. One major reason for this lack of attention is that there is no well-accepted measure for either. If there is no measure, it is hard to track progress. If I have a
company and develop an innovative system that can handle a slightly larger volume than the
competition, I can show this off with measurements against some benchmark. If I am an
academic and develop an algorithm that scales better than the competition, I know exactly how
to compare my algorithm against the competition and persuade skeptical reviewers. In contrast,
consider variety. If I have a product that makes handling variety a little easier, what technical
claim can I make that doesn’t sound like marketing hype? If I write a paper about a data model
that is better at handling variety than the current state of the art, I have to think very hard about
how I will compare against the competition and establish the goodness of my idea. Progress is
hard in things you cannot measure, in both industry and academia. Variety may be the hardest of
the 4Vs to address, but it is the one that people are least motivated to speak about.”
Different techniques exist for the different types of big data being analyzed (structured
or unstructured). The types of Big Data analysis methods are as follows: text analytics, audio
analytics, video analytics, social media analytics, and predictive analytics.
Audio analytics mostly consists of speech analysis, aimed at tracking customer feedback,
as Gandomi (2015, 141) describes: “Call centers use audio analytics for efficient
analysis of thousands or even millions of hours of recorded calls. These techniques help improve
customer experience, evaluate agent performance, enhance sales turnover rates, monitor
compliance with different policies (e.g., privacy and security policies), gain insight into customer
behavior, and identify product or service issues, among many other tasks. Audio analytics
systems can be designed to analyze a live call, formulate cross/up-selling recommendation based
on the customer’s past and present interactions, and provide feedback to agents in real time.”
Video analytics is the least developed branch, but it bears potential for customer
behavior analysis, as Gandomi (2015, 142) states: “…potential application of video analytics in
retail lies in the study of buying behavior of groups. Among family members who shop together,
only one interacts with the store at the cash register, causing the traditional systems to miss data
on buying patterns of other members. Video analytics can help retailers address this missed
opportunity by providing information about the size of the group, the group’s demographics, and
the individual members’ buying behavior.”
Text, social media, and predictive analytics are relevant for stock market forecasting, so
we will discuss them briefly.
Text analytics deals with all kinds of written sources, such as news, blogs, emails,
documents, and so on. It derives the main ideas out of huge amounts of textual data by creating
summaries. Chung (2014) supports the idea that this technique can be used for stock market
forecasting, as it can forecast price movements based on financial experts' sentiments.
According to Gandomi A. (2015), text analytics techniques include:
1. Information extraction – converting unstructured textual data into structured data.
2. Text summarization – a technique that generates meaningful summaries from
texts using Natural Language Processing methods.
3. Question answering – another technique using Natural Language Processing
methods. It provides answers to questions formulated in natural language by
going through three steps: question processing, text processing, and answer
processing.
4. Sentiment analysis – a method aimed at deriving an aggregated customer or
expert opinion regarding some product or event. It operates by classifying
opinions as either negative or positive; then, based on the scores of these two
classes, the overall sentiment is determined (see the sketch below).
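To make the scoring step concrete, here is a minimal, hypothetical sketch in Python. The word lists and the aggregation rule are toy assumptions for illustration; they are not the method of any particular platform discussed in this thesis.

```python
# A toy lexicon-based sentiment scorer: classify each opinion as positive,
# negative, or neutral, then aggregate the class scores into one sentiment.
POSITIVE = {"good", "great", "strong", "beat", "upgrade", "bullish"}
NEGATIVE = {"bad", "weak", "miss", "downgrade", "bearish", "loss"}

def score_text(text):
    """Return +1 (positive), -1 (negative), or 0 (neutral) for one opinion."""
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos > neg) - (pos < neg)

def aggregate(texts):
    """Overall sentiment as the mean of the individual class scores."""
    scores = [score_text(t) for t in texts]
    return sum(scores) / len(scores) if scores else 0.0

print(aggregate(["Strong quarter, analysts upgrade the stock",
                 "Weak guidance and a surprise loss"]))  # prints 0.0
```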
Social media analytics is used primarily for marketing purposes, such as customer
satisfaction analysis, community detection, etc., as social networks provide great opportunities
for target audience analysis. However, it can also be used for stock market forecasting: for
example, Antweiler W. (2004) conducted a study showing that the Yahoo Finance message
board could be used for stock price prediction.
Finally, there is predictive analytics, which includes a variety of quantitative methods
that can be used to predict almost anything, from crime rates to stock market volatility.
Predictive analytics techniques fall into two categories: autoregression and regression
analysis. The first discovers patterns within the history of the chosen variable itself; the second
explores dependencies between different variables.
Increased academic attention to Big Data can be explained by the advancement of
computing technologies. Modern data mining tools have made it possible for researchers to work
with huge amounts of structured and unstructured data. Christine E. Earley (2015, 494) supports
this statement: “The availability of large amounts of computerized data in companies has been
steadily increasing over the years, but recent advances in processing speed, cloud storage, and
the rise of social networks has changed the ease of access to data and the nature of data that can
be captured and stored for later use. At the same time, software used to analyze large volumes of
data (i.e., data mining tools) as well as more sophisticated data visualization tools can potentially
increase the ability of individuals to understand the story that the data is telling them”.
1.2 Predictive analytics.
Matlis J. (2006, 42) defines predictive analytics as follows: “Predictive
analytics is the branch of data mining concerned with forecasting probabilities. The technique
uses variables that can be measured to predict the future behavior of a person or other entity.
Multiple predictors are combined into a predictive model. In predictive modeling, data is
collected to create a statistical model, which is tweaked as additional data becomes available.”
As is evident from the definition, predictive analytics uses the same statistical methods
as quantitative analysis; the difference between them lies in the sequence of the research steps.
Joe F. (2007) describes the processes of quantitative analysis and predictive analytics as follows:
Quantitative analysis steps:
1. Theory
2. Hypotheses Development
3. Test
Predictive analytics steps:
1. Data
2. Relationships Development
3. Hypotheses
4. Model Building and Hypothesis Testing
5. Model Validation
As we can see, predictive analytics offers more possibilities for analysis, as it can find
interdependencies that would otherwise have been overlooked.
The difference between predictive analytics and quantitative analysis can be represented
from an explanatory vs. predictive modeling perspective.
Explanatory statistical models test predefined, theory-based hypotheses. The role of
explanatory statistics is to show the causal dependencies between variables. To build an
explanatory model, one should first identify the cause-and-effect relationships between
variables, and then build a model to test the hypotheses. In other words, explanatory statistics is
used to prove that the revealed connections between the factors and the dependent variable are
relevant. To evaluate such models, analysts use statistical measures, such as R-squared, which
gauge the explanatory power of a model.
Predictive models have a different construction mechanism: instead of focusing on
theory-based causal links between variables, predictive models are based on associative links.
Predictive analysis, unlike explanatory analysis, starts with the data. It then looks for
associations between variables within the dataset and builds forecasts based on the findings.
Evaluation of predictive models is based on measuring predictive accuracy instead of
explanatory power, as the sketch below illustrates.
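The contrast can be shown in a short sketch: one and the same linear model is judged once by its in-sample fit (the explanatory view) and once by its accuracy on held-out observations (the predictive view). The data here are synthetic and purely illustrative.

```python
# Explanatory vs. predictive evaluation of the same linear model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
model = LinearRegression().fit(X_train, y_train)

# Explanatory view: variance explained on the data the model was fit to.
print("in-sample R^2:", round(model.score(X_train, y_train), 3))
# Predictive view: accuracy on observations the model has never seen.
print("out-of-sample R^2:", round(model.score(X_test, y_test), 3))
```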
Shmueli (2010) points out four criteria that distinguish predictive from explanatory
analytics: "… causation-association, theory-data, retrospective-prospective, and bias-variance".
The bias-variance perspective refers to the different evaluation criteria for predictive and
explanatory models: the first seeks to minimize sample variance, whereas the latter minimizes
the model's bias.
Both approaches (explanation and prediction) are hardly compatible within a single
model, as the best explanatory model is not the best predictive one, argues Konishi S. (2007),
even though it has some level of predictive power.
Predictive models increase their accuracy at the cost of higher bias; therefore, predictive
models are not necessarily "true", in the sense that there may be no theoretical foundation for
them. Since predictive analytics operates on big data, it inevitably faces challenges, which
Fan, J. (2014) identifies as follows:
1. Heterogeneity. Data obtained from multiple sources and in different formats
creates additional difficulties for an analyst.
2. Noise accumulation. Predictive models are built using multiple factors at the
same time, and the accumulated errors create "noise" that can conceal the true
influence of some factors.
3. Spurious correlation. Due to the huge sizes of datasets and the multiple variables
being analyzed, false correlations may be detected (see the demonstration after
this list).
4. Incidental endogeneity. This is the threat of breaking one of the traditional
assumptions of regression analysis, exogeneity: some of the predictive
factors may be dependent on the residual term.
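Point 3 is easy to demonstrate: in the sketch below all variables are generated independently, yet the largest pairwise correlation found in the dataset grows steadily with the number of variables. The sample sizes are arbitrary illustrative choices.

```python
# Spurious correlation: with many variables and few observations,
# some pair will appear correlated purely by chance.
import numpy as np

rng = np.random.default_rng(1)
n_obs = 50
for n_vars in (10, 100, 1000):
    X = rng.normal(size=(n_obs, n_vars))  # all columns truly independent
    corr = np.corrcoef(X, rowvar=False)   # pairwise correlation matrix
    np.fill_diagonal(corr, 0.0)           # ignore self-correlations
    print(n_vars, "variables -> max |correlation|:",
          round(float(np.abs(corr).max()), 2))
```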
Application areas of predictive analytics range from business-related topics, such as
retail, marketing, and finance, to healthcare and environmental issues. Retailers use predictive
analytics to forecast demand for particular products. Marketers use it to create customer
profiles, to determine the public's reaction to new products, and to detect customer
communities. Law enforcement agencies use it to predict the occurrence of crimes, healthcare
systems employ predictive analytics to make more precise diagnoses, and customs agencies use
it for fraud detection.
There are numerous possible applications of data analytics. Banks and other financial
organizations also have much to gain from predictive analytics. Today, big data challenges both
firms and individual traders, and those capable of rapidly extracting and analyzing relevant
information will gain a competitive edge. As a report from SAP (2012) states: “…the
profitability keeps falling in recent years, and organizations are now evolving towards smart
trading based on big data analytics. Besides designing more complex computing model and
system, how to make such large scale computation real time is still a very important problem that
is needed to be considered seriously”.
1.3 Social Media and Business news Analytics
There is a subset of Big Data that refers to data derived from social media: social big
data.
Bello, O. (2016, 47) defines social big data as follows: “Those processes and methods
that are designed to provide sensitive and relevant knowledge to any user or company from
social media data sources when data sources can be characterized by their different formats and
contents, their very large size, and the online or streamed generation of information.”
Methods of processing social big data constitute social big data analytics, which is
defined by Bello, O. (2016, 47) as follows: “Social big data analytic can be seen as the set of
algorithms and methods used to extract relevant knowledge from social media data sources that
could provide heterogeneous contents, with very large size, and constantly changing (stream or
online data). This is inherently interdisciplinary and spans areas such as data mining, machine
learning, statistics, graph mining, information retrieval, and natural language among others. This
section provides a description of the basic methods and algorithms related to network analytics,
community detection, text analysis, information diffusion, and information fusion, which are the
areas currently used to analyze and process information from social-based sources.”
Social big data may be of use not only for companies that trade in consumer goods, but
also for the financial and banking sector.
Assets' prices are determined not by impartial machines but by the individuals who trade
on the stock exchange. Like any human beings, they are not completely rational; their decisions
are influenced by the public's mood and by rumors.
The advancement of analytical applications has made it possible for researchers to
include psychological factors in their predictive models. Tracing these factors is challenging,
since they are hidden in huge amounts of unstructured data. One such factor is customers'
sentiments and opinions about a company or product.
People's expectations and opinions about a particular company or product are reflected
on social media platforms such as Twitter and Facebook. Models for stock price prediction
based on an analysis of the public's mood were built and tested in academic articles such as
Bollen, Mao, and Zeng (2011) and Wu He (2015).
Johan, B. (2011) showed that even the Dow Jones Industrial Average index could be
predicted by analyzing Twitter mood. The first step in constructing a predictive model based on
information derived from social media is the extraction of public sentiment; there are various
software tools for that purpose, including IBM Watson. The second step is data processing,
done by assigning scores or dimensions to every observation. Scores could be "positive",
"neutral", "negative", or take other forms. After transforming the initial unstructured data into
structured scores, the usual statistical methods can be applied, as the sketch below illustrates.
Using the same technique, one can build a prediction model based on machine processing of a
vast number of business news articles; such a model was built by Chowdhury (2014). The
forecasting accuracy of public sentiment models varies from 70 to 80%.
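As a hypothetical illustration of this two-step approach, the sketch below maps sentiment labels to numeric scores and regresses the next day's price change on the previous day's aggregate mood. The label mapping and all numbers are invented; no claim is made about the pipelines of the cited studies.

```python
# Step 1: turn sentiment labels into structured numeric scores.
# Step 2: apply an ordinary regression to the scored series.
import numpy as np

LABEL_TO_SCORE = {"positive": 1, "neutral": 0, "negative": -1}

daily_mood = ["positive", "positive", "negative", "neutral",
              "negative", "positive", "neutral", "positive"]
price_change = np.array([0.4, 0.6, -0.5, 0.1, -0.3, 0.5, 0.0, 0.2])

scores = np.array([LABEL_TO_SCORE[m] for m in daily_mood])

# Regress today's price change on yesterday's sentiment score.
x, y = scores[:-1], price_change[1:]
slope, intercept = np.polyfit(x, y, deg=1)
print(f"price_change ~ {intercept:.2f} + {slope:.2f} * lagged_sentiment")
```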
1.4 Market of predictive analytics tools in financial sphere
The market for financial analytics is a fast-growing niche. The Financial Analytics
Market forecast conducted by Research and Markets (2014) estimated that by 2018 the total
market value of financial analytics would reach 6.65 billion dollars. Such rapid growth is driven
by the impact of big data on the operations of banks, audit firms, and other types of financial
organizations.
Nowadays, researchers point out the importance of predictive analytics for all
organizations. For example, Ventana Research (2016, 3) states: “Organizations increasingly need
to understand what's happening right now and to be able to forecast what is likely to happen in
both the near future and the long term.” Ventana Research (2016) sees predictive analytics as a
means to serve this need. Currently, there are multiple providers of analytical tools on the
market, among them such giants as IBM, Microsoft, Google, and Amazon.
According to Doug, H. (2015), the leading positions on the market of analytics platforms
belong to IBM, KNIME, RapidMiner, and SAS. Microsoft is lagging behind, but over the past
two years it has shown positive momentum and is now catching up with the leaders.
One of IBM's products, the Statistical Package for the Social Sciences (SPSS), became
particularly popular among researchers and data scientists. SPSS embeds a vast array of
statistical tools and provides customers with the ability to apply econometric modeling to their
data. SPSS offers everything an analyst needs, but its main drawback is the demands it places
on the user: only a qualified specialist with expertise in statistics and econometrics can use
SPSS properly.
Apart from SPSS, IBM offers another service, available through the cloud: IBM Watson
Analytics. Watson Analytics provides customers with natural-language predictive and visual
analytics. It includes data storage, data processing, data analysis, and visualization. In addition,
it can run social media analysis (Twitter), helping to assess the public's sentiment towards any
given event, company, or product.
Three key properties of IBM Watson Analytics are as follows:
1. A comprehensive range of services: unlike other analytical tools, which are
designed to solve particular types of business tasks, Watson Analytics helps to
refine data, evaluate its quality, analyze it, and create a report, thus rendering
the use of other tools unnecessary.
2. Predictive analytics: IBM Watson automatically determines the most relevant
data and reveals interconnections between variables.
3. Use of natural language: IBM Watson allows users to ask questions in plain
English, thus making it possible for a person without knowledge of statistics
to work with the data.
Microsoft's, Amazon's, Google's, and SAS's predictive analytics offerings are
represented by Azure Machine Learning, AWS Machine Learning, the Google Prediction API,
and SAS Visual Analytics, respectively.
In essence, they are analogues of IBM Watson: all of them provide visualization,
analytical, and predictive services accessible through the cloud. An important feature of these
products is that they offer predefined analytical models for particular business needs: banking,
insurance, retail, etc. Unlike IBM Watson Analytics, they provide customers with the ability to
develop their own applications for very specific purposes. There are many other players in the
advanced analytics market, such as Prognoz, SAP, and Oracle, but they occupy niche markets.
1.5 Research gap.
The influence of big data and the applications of predictive analytics in different spheres
of business, healthcare, and public safety have gained some attention in the past few years.
However, marketing has received the most attention: analyzing customer feedback, detecting
communities via social media, and demographic profiling of customers.
The influence of big data and predictive analytics on the financial and banking sectors
has been noticed in academic circles. Some academic papers, such as Earley, E. (2015),
Yoon, H. (2015), and Min, C. (2015), address the opportunities and challenges of big data
analytics for auditing. Other studies, such as Srivastava, U. (2015), analyze the application of
big data analytics in the banking sector, but they mostly cover customer profiling, risk
management, and fraud detection. Smith (2015) and Bologa (2010) have discussed the influence
of big data and big data analytics on the insurance sector.
Kwan, M. (2014) and Ruta, D. (2014) brought to the attention of academics the problem
of the applicability of big data analytics and predictive analytics for increasing the effectiveness
of trading operations on the stock market.
However, their research only stated the opportunities and challenges of big data in
trading; they did not run an empirical check and did not compare the analytical platforms
available on the market. Both an information deficit and an abundance of information can make
it hard for a trader to decide on a trading strategy. The profit of an individual trader, bank, or
brokerage firm depends on how quickly and effectively relevant information is extracted from
high-volume datasets of unstructured and structured data. The rise of big data creates a need for
effective and reliable methods and tools for processing vast amounts of market data.
All of the aforementioned authors have identified possible implications of big data
analytics for banking, audit, and insurance, but there is still room for research whose goal is to
find out how a particular type of financial organization (bank, auditor, insurer, or trader) could
achieve its business objectives using particular types of advanced analytical platforms.
The goal of this research is to fill this gap by assessing possible applications of
predictive analytics for stock market forecasting.
1.6 Research methodology and organization of the study.
In the course of this research, we will use quantitative methods to analyze and compare
the forecasting abilities of the leaders of the advanced analytical platforms market: IBM Watson
Analytics, SAS Analytics, KNIME, and RapidMiner. The comparative analysis will be based on
a set of predefined KPIs.
Using the KPIs, we will assess the ability of these analytical platforms to execute the
business tasks of financial organizations. Based on this assessment, we will run a comparative
analysis of the platforms and generate recommendations regarding which platform to use for
the purposes of stock market forecasting.
Special attention will be paid to one of the KPIs: forecasting accuracy. To compare the
chosen analytical platforms by this KPI, we will build predictive models in IBM Watson
Analytics and in the Gretl statistical package.
First, we will build econometric models for predicting currency exchange rates. For that
purpose, we will build two types of models: ARIMA and factor regressions, which use the
prices of the country's main export commodity.
The next financial assets whose prices we will try to predict are the blue chips of the
stock markets: IBM, Microsoft, P&G, etc. As a theoretical base, we will use the Capital Asset
Pricing Model (CAPM). The United States financial market is one of the most developed;
therefore, its reality is as close to the Efficient Market Hypothesis (EMH) as it gets on real-life
markets.
The last financial assets we will take into consideration are stock indexes. The
importance of considering stock indexes stems from the fact that they serve as a guideline for
traders, analysts, and investors, because they reflect the overall situation on the market.
This is the first phase of the empirical research, and it will be conducted using the Gretl
statistical package. Our next step will be the construction of predictive models for the same
assets, using the same datasets, in all the aforementioned analytical platforms. Apart from
building alternative quantitative models using financial data, we will make use of social big
data by running Twitter analysis with the help of IBM Watson Analytics.
The accuracy of the forecasts will be assessed through two characteristics: Mean
Absolute Percentage Error and the potential profitability of applying the models. Potential
profitability will be estimated through simulation experiments in which we imitate real-life
trading using the given models. We set the investor's behavior as follows: the investor profits
from the difference between the prices of the same asset in two consecutive time periods. If the
model predicts that the price will go up, the investor buys the asset with the intention of selling
it in the next period regardless of its actual price. If the model predicts depreciation of the asset,
it works vice versa.
The result of the research will be a comparative analysis of the forecasting abilities of
some of the main analytical platforms available on the market.
1.7 Conclusion of Chapter I.
The rise of big data in recent years has created challenges and opportunities for every
type of business. To tackle these challenges and not miss the opportunities, it is necessary to
use predictive analytics techniques. Big data consists of vast amounts of structured and
unstructured information and is characterized by the three V's: volume, variety, and velocity.
Different kinds of big data call for different types of data analytics techniques: text analytics,
audio analytics, social media analytics, and predictive analytics.
Big data affects many spheres of business, including trading, as processing and analyzing
immense amounts of data makes it possible for an analyst to uncover interdependencies and
patterns that would otherwise have been ignored. Big data holds the potential to increase the
effectiveness of trading deals on the stock market; therefore, it is a subject of interest for both
individual traders and brokerage firms. A trader's interest in an analytical platform lies in its
capability to explore the data and to find interrelationships and correlations between variables.
These interdependencies and correlations within a dataset could be detected using
traditional statistical methods. However, predictive analytics and conventional statistical
methods are not completely similar, despite the fact that predictive analytics and econometrics
use the same mathematical and statistical toolkit. There is one fundamental difference between
them: in order to build an econometric model, one should find theoretical grounds for it,
formulate a statistically verifiable hypothesis, and test it. This approach leads to the creation of
explanatory models, which describe the factors that drive the observed variable; however, such
models do not have the best predictive accuracy. Predictive analytics, just like econometric
modeling, uses statistical methods, but it differs in the research approach: it does not need to
test a predefined hypothesis; instead, it explores interdependencies between the observed
variable and the whole set of possible predictive factors. As a result, a predictive model is
created, which may, however, lack theoretical explanation and may be more biased than an
explanatory one. Additionally, the advancement of cloud computing has made it possible to run
social media and investor sentiment analysis.
Finally, such characteristics of an analytical platform as text analysis and social media
analysis are of interest to every financial organization (except audit firms, since the
applicability of social media to audit is not confirmed), as the majority of information comes in
unstructured form.
There are many analytical platforms available on the market; we will take into account
only the top five of them, according to Gartner's Magic Quadrant for advanced analytics
platforms (2016). Most of them are available only through the cloud (IBM Watson, Azure
Machine Learning, SAS Visual Analytics, Amazon Machine Learning); however, some
platforms offer their services offline: KNIME and RapidMiner, which were also recognized by
Gartner as market leaders.
The market for advanced analytical platforms is one of the most dynamic. A comparison
of Gartner's Magic Quadrants from 2014 and 2015 reveals serious movements on the market.
However, one player attracts special attention: IBM, with its cloud-based analytical service
called Watson Analytics. IBM has been the market leader for several years, and its service
provides an easy way for a researcher to analyze and visualize huge amounts of data.
The ability to process big datasets simultaneously holds potential for stock market
analysis. Nowadays, there is too much information on the market, coming from multiple
sources; it is impossible to assess all relevant information in a short time, and time is of the
essence when it comes to forecasting a stock market.
Chapter II. Research framework.
2.1 Research goals, KPIs, objectives, questions, and limitations.
The purpose of this work is to provide potentially interested parties (trading firms) with
a comparative analysis of predictive analytics tools and providers, in order to help them decide
which product to use for a particular task.
In the course of this research, we will analyze and compare the main advanced analytical
platforms available on the market. Each advanced analytical platform has its own
characteristics, identified by Ventana Research (2016) as follows:
1. User roles and self-service: this characteristic reflects the ability of a platform to
be used by different kinds of users, with different data analysis skills and
different requirements for analytics.
2. Information optimization: reflects the ability of an analytical platform to
manage different kinds of data flows coming from different sources, and the
ability to refine the data.
3. Range of analytical capabilities: includes visualization capabilities, data
exploration capabilities (uncovering hidden patterns), and the ability to detect
particular events in a dataset.
4. Cloud and Mobile deployment.
5. Time to value: the ability of a platform to perform the analysis and present the
results in the shortest time possible.
From these five KPIs we will take two: User Roles and Self-Service, and Range of
Analytical Capabilities. We will break them down into sub-criteria as follows: Visualization,
Simplicity of Use, Predictive Analytics Capabilities, Range of Econometric Modeling, Textual
Analytics Capabilities, and Social Analytics Capabilities.
After evaluating the analytical platforms using these KPIs, we will analyze how well
each of them addresses the needs of particular kinds of financial organizations. Then we will
provide interested parties with recommendations regarding which advanced analytical platform
to use for each business objective.
In addition, we will look into how the ability of an advanced analytical platform (using
IBM Watson Analytics) to suggest predictive models compares with standard theoretical
approaches to stock price forecasting.
The goal of this research is twofold. First, it is to run a comparative analysis of the main
advanced analytical platforms. Second, it is to assess the ability of IBM Watson Analytics to
suggest effective predictive models for stock price forecasting.
The research questions of this work are as follows:
1. Which analytical platform is a better fit for the purposes of stock price
forecasting?
2. Does an analytical platform (using IBM Watson Analytics as an example)
suggest effective predictive models for stock forecasting, in comparison with
standard theoretically based econometric models?
Research objectives:
1. To evaluate the analytical platforms (IBM Watson Analytics, SAS Analytics,
KNIME, and RapidMiner) using the KPIs mentioned above.
2. To make a comparative analysis of the analytical platforms.
3. To rank them based on their ability to make predictive models for stock market
forecasting.
4. To construct and evaluate theoretically based econometric models for stock
price forecasting.
5. To construct econometric models for stock price forecasting using factors
suggested by the IBM Watson Analytics Predict function.
6. To compare the performance of the theoretically based and the Watson
Analytics-suggested models.
The analytical platforms taken into consideration are IBM Watson Analytics, SAS
Analytics, KNIME, and RapidMiner. This choice is justified by Gartner's Magic Quadrant for
advanced analytics platforms (2016), which identified them as the market leaders.
Limitations:
1. Not all advanced analytical platforms available on the market are considered.
2. The ability of IBM Watson Analytics to suggest predictive models will be
compared only with the most commonly used econometric models: a
comparative analysis with all possible econometric models is impossible, as
there are too many of them, and new ones can always be generated.
3. Not all analytical capabilities of the analytical platforms will be empirically
tested.
4. The simulation of potential profitability is made under the assumption that an
investor has access to all necessary information and reacts to it instantly.
2.2 Methods of evaluation of advanced analytical platforms.
Evaluation of the analytical platforms will be done using the Analytic Hierarchy Process
(AHP). According to Abdullah (2013), AHP is conducted in seven steps:
1. Determination of the hierarchy of criteria and calculation of the normalized
matrix.
2. Determination of the criteria weights.
3. Determination of the eigenvector.
4. Check of the consistency ratio.
5. Comparison of the alternatives.
6. Calculation of the alternatives' scores.
7. Ranking of the alternatives.
The hierarchy of criteria is determined by their relative importance for the goal (car
purchasing, vendor choice, etc.). As a result of a pairwise comparison of the criteria, an
$n \times n$ matrix is created. Its elements reflect the relative value of the criteria to each other:
element $a_{ij}$ indicates the value of criterion $i$ relative to criterion $j$, with $a_{ii} = 1$ and
$a_{ji} = 1/a_{ij}$.
Next, a normalized matrix is defined: each of its elements is obtained by dividing the
corresponding pairwise comparison value by the sum of the pairwise comparison values in its
column.
The criterion weight is determined as the mean of the elements of the corresponding row
of the normalized matrix:

$$\mu_i = \frac{1}{n} \sum_{j=1}^{n} a_{ij} \qquad (1)$$

The eigenvector is determined as follows:

$$w_i = \frac{\sqrt[n]{\mu_i}}{\sum_{i=1}^{n} \mu_i} \qquad (2)$$
The consistency ratio is calculated as:

$$CR = \frac{CI}{RI} \qquad (3)$$

$$CI = \frac{\gamma_{\max} - n}{n - 1} \qquad (4)$$

$$\gamma_{\max} = \sum_{i=1}^{n} \frac{(A w)_i}{n\, w_i} \qquad (5)$$

RI is a random index, which takes values depending on the number of elements $n$. CR
should be no more than 0.1.
The scores of the alternatives are calculated using the following equation:

$$A_{\text{score}} = \max_i \sum_{j=1}^{n} a_{ij} w_j \qquad (6)$$

Based on these scores, the final ranking is constructed. A compact sketch of the whole
procedure is given below.
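The following Python sketch runs the steps above on a hypothetical 3 x 3 comparison matrix. It uses the row-mean weights of equation (1) and the consistency check of equations (3)-(5); the matrix entries and the random index for n = 3 (taken from Saaty's table) are illustrative inputs, not results of this study.

```python
# AHP in brief: normalize the pairwise matrix, derive weights,
# and check the consistency ratio.
import numpy as np

A = np.array([[1.0, 3.0, 5.0],    # pairwise comparisons; a_ji = 1 / a_ij
              [1/3, 1.0, 3.0],
              [1/5, 1/3, 1.0]])
n = A.shape[0]

norm = A / A.sum(axis=0)          # divide each element by its column sum
w = norm.mean(axis=1)             # criteria weights: row means, eq. (1)

gamma_max = ((A @ w) / w).mean()  # eq. (5)
CI = (gamma_max - n) / (n - 1)    # consistency index, eq. (4)
RI = 0.58                         # random index for n = 3 (Saaty's table)
CR = CI / RI                      # consistency ratio, eq. (3); keep <= 0.1

print("weights:", np.round(w, 3), " CR:", round(CR, 3))
```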
2.3 Methods of comparing the forecasting accuracy of IBM Watson and statistical
packages.
The method of research is comparative analysis based on results generated by
quantitative methods. For the purpose of comparing the forecast accuracy of the different tools,
we will consecutively create predictive models in a statistical package and in IBM Watson
using the same dataset.
In order to have the most tried and reliable econometric models for comparison, we will
run forecasts of currency exchange rates, of stock price dynamics, and of stock indexes.
Econometric model building follows three steps:
1. Theory
2. Hypotheses Development
3. Test
We will forecast currency exchange rates using two approaches: ARIMA models, and
linear regression models that use the prices of the most exported commodities as independent
variables. The theoretical foundations of these models can be found in Meese and Rogoff
(1983) and in Rogoff and Rossi (2015).
Additionally, we will build CAPM models for the stocks of the biggest corporations,
such as Google, Microsoft, etc. We will use only the USA stock market for building CAPM
models, because CAPM operates under the assumptions of the EMH (Efficient Market
Hypothesis); therefore, CAPM does not fit developing stock markets.
Forecasting with IBM Watson Analytics differs from building econometric models; the
main difference is that it does not require strong theoretical grounds in order to make a model:
it analyzes the whole dataset and automatically suggests models.
Predictive analytics steps:
1. Data
2. Relationships Development
3. Hypotheses
4. Model Building and Hypothesis Testing
5. Model Validation
As we can see, IBM Watson lacks theoretical grounds for model building, but the best
predictive models are not necessarily the best theoretically grounded ones, as stated by
Shmueli, G. (2010).
The comparison of the predictive models will be based on two indicators:
1. Mean Absolute Percentage Error (MAPE)
2. Potential profitability
Potential profitability will be estimated as the profit generated by the given model
during a simulation.
The simulation will be run in accordance with the following rules:
1. If the model predicts that the price of the asset will rise in the next period, the
investor decides to buy the asset.
2. If the model predicts that the price of the asset will fall in the next period, the
investor decides to sell the asset.
3. If the investor bought the asset, he sells it in the next period regardless of its
new price.
4. If the investor sold the asset, he buys it back in the next period, regardless of
its new price.
At the end of the chosen period, the investor stops and calculates his or her returns,
which will be used as an indicator of the forecasting accuracy of the model. In order to have a
more reliable indicator of forecasting accuracy, we will run a model simulating real-life trading.
The rules of the model are simple: if it anticipates that the asset's price will increase in the next
period, the investor decides to buy the asset with the intention of selling it afterwards.
Depending on the actual change in prices, such operations can bring profits or losses. A
minimal sketch of both accuracy measures follows.
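The sketch below implements MAPE and the trading rules above, assuming frictionless trading of one unit per period and instant reaction (limitation 4 above); the price and forecast series are invented for illustration.

```python
# MAPE plus the trading simulation described by rules 1-4 above.
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual, forecast = np.asarray(actual), np.asarray(forecast)
    return float(np.mean(np.abs((actual - forecast) / actual))) * 100

def simulate_profit(prices, forecasts):
    """Buy before a predicted rise, sell before a predicted fall;
    every position is closed at the next period's actual price."""
    profit = 0.0
    for t in range(len(prices) - 1):
        if forecasts[t + 1] > prices[t]:   # predicted rise -> buy now
            profit += prices[t + 1] - prices[t]
        else:                              # predicted fall -> sell now
            profit += prices[t] - prices[t + 1]
    return profit

prices    = [100.0, 102.0, 101.0, 104.0]
forecasts = [100.0, 101.5, 102.0, 103.0]  # one-step-ahead model forecasts
print("MAPE, %:", round(mape(prices[1:], forecasts[1:]), 2))
print("profit:", simulate_profit(prices, forecasts))
```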
2.4 Method of currency exchange rate forecasting using Statistical Packages
Since the publication of the highly cited article by Meese, R. (1983), it has become a sort
of benchmark to compare all currency exchange rate models with the random walk model,
which performs no worse than any other model.
However, some more recent research, such as Moosa, I. (2014), argues that the
unbeatable random walk is in fact an illusion: the random walk model seems superior only if it
is evaluated in terms of mean square error, mean absolute error, or root mean square error,
whereas if a model is evaluated by its direction-forecasting power and profitability, the random
walk loses to almost all other models.
A random walk is a type of non-stationary time series, defined as follows:

$$X_t = X_{t-1} + e_t \qquad (7)$$

where $X_t$ is the observable variable and $e_t$ is a pure random component.
The difference between the random walk and the autoregression AR(1) is that the effect
of every random component is preserved forever. If the process begins at $t = 0$, then:

$$X_t = X_0 + e_1 + \ldots + e_t \qquad (8)$$

In a more general case there is a constant $B_1$, which turns the process into a random
walk with a trend:

$$X_t = X_0 + B_1 t + e_1 + \ldots + e_t \qquad (9)$$
Another non-stationary process is a time series with a deterministic trend:

$$X_t = B_1 + B_2 t + e_t \qquad (10)$$

The main difference between this model and the random walk is that a time series with a
deterministic trend tends to return to the trend line, while a random walk with a trend does not
necessarily return to it, as the small simulation below illustrates.
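The following sketch simulates equations (9) and (10) with arbitrary drift and noise parameters: the random walk with drift wanders arbitrarily far from the trend line, while the trend-stationary series keeps returning to it.

```python
# Random walk with drift vs. a deterministic-trend series.
import numpy as np

rng = np.random.default_rng(42)
T = 500
e = rng.normal(size=T)

random_walk = np.cumsum(0.05 + e)         # drift plus accumulated shocks, eq. (9)
trend_line = 0.05 * np.arange(1, T + 1)   # the common deterministic trend
trend_stationary = trend_line + e         # noise around the trend, eq. (10)

print("random walk, max |deviation from trend|:",
      round(float(np.abs(random_walk - trend_line).max()), 1))
print("trend-stationary, max |deviation from trend|:",
      round(float(np.abs(trend_stationary - trend_line).max()), 1))
```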
One more approach to currency exchange rate forecasting is a regression model based on
the prices of the main export commodities of a given country. Such a model was tested by
Ferraro, D. (2015). They tried to forecast the US dollar - Canadian dollar, US dollar -
Australian dollar, US dollar - Norwegian krone, US dollar - South African rand, and US dollar -
Chilean peso exchange rates based on the prices of oil, gold, and copper. The results revealed
short-term relationships between the price of a country's main export commodity and its
currency's nominal exchange rate.
However, the applicability of such a model is limited to countries that have a small
number of main export commodities, meaning that the exchange rates of most developed
countries' currencies cannot be predicted using this model.
As an approach to currency exchange rate forecasting, we will use ARMA and ARIMA
autoregression models.
The ARMA model is defined as follows:
X_t = B_1 + B_2 X_{t-1} + ... + B_{p+1} X_{t-p} + e_t + a_2 e_{t-1} + ... + a_{q+1} e_{t-q}   (11)
where X_t is the observable variable, B_i are the coefficients which determine the influence
of the previous observations on the current one, and a_i are the coefficients which determine
the influence of the previous random components on the current observation. The ARIMA model
applies the same structure to a series differenced d times, so that non-stationary series can be
modeled as well.
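A minimal sketch of fitting such a model, assuming the statsmodels library and a hypothetical exchange-rate series:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    # hypothetical daily exchange-rate series
    rng = np.random.default_rng(1)
    rate = 1.10 + np.cumsum(rng.normal(scale=0.002, size=300))

    # ARIMA(1,1,1): one AR term, one MA term, first-differenced once;
    # ARIMA(0,1,0) would reduce to the random walk used for comparison.
    model = ARIMA(rate, order=(1, 1, 1)).fit()
    print(model.forecast(steps=5))   # next five one-step-ahead forecasts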
2.5 Methods of stock forecasting using Statistical packages
Econometrical methods of financial market analysis have a strong mathematical and
statistical grounding. However, their applicability is limited due to the assumptions upon which
econometrical models are based.
Most theoretical models of stock market forecasting require the so-called Efficient Market
Hypothesis.
The efficient market hypothesis refers mostly to the information effectiveness of a market.
It implies that information is equally available to all participants of the market; they interpret it
in a similar manner and instantly use it to adjust their strategies and operations.
In addition, efficient market theory suggests that all players are rational, have similar goals
and use similar strategies.
The main characteristic of an efficient market is a result of the realization of all the
aforementioned assumptions. If a market is efficient, then the prices of assets instantly, completely
and correctly assimilate all available and relevant information, and reach equilibrium, thus
making the regular gain of abnormal incomes impossible.
In an efficient market, it is considered that expected returns include all systematic risks
and provide investors with acceptable returns, consistent with all other assets of a similar risk level.
One of the basic models based on efficient market theory is the Capital Asset Pricing Model
(CAPM). Its main equation looks as follows:
mu_i = R_0 + beta_i (mu_M - R_0)   (11)
where mu_i is the expected return of any given asset; R_0 is the risk-free return; beta_i is the beta
coefficient, reflecting the nature of the asset (riskier and more profitable assets have beta_i > 1,
and less risky and less profitable ones have beta_i < 1); mu_M is the average market return on assets.
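As a worked illustration of equation (11), with assumed, hypothetical numbers:

    # assumed values: risk-free rate 2%, beta 1.2, average market return 8%
    R0, beta, muM = 0.02, 1.2, 0.08
    mu = R0 + beta * (muM - R0)   # CAPM equation (11)
    print(mu)                     # 0.092, i.e. an expected return of 9.2%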
CAPM is based on a list of assumptions:
1. Investors evaluate assets using their expected returns and risks
2. Expected returns are stochastic
3. Risk is measured as the dispersion of returns
4. Investors are trying to maximize their assets' returns
5. Investors are risk averse
6. Absence of a monopolistic influence on the market
7. Absence of taxes
8. Absence of transaction costs
9. Absence of unexpected inflation
10. Assets are infinitely divisible
11. No limitations on borrowing and lending at the risk-free rate
12. All investors have a similar planning horizon
13. All investors evaluate the probability distribution of expected returns in the same way
14. Information is free and all investors have equal access to it.
It is evident from the list that the CAPM assumptions are unrealistic, as nearly half of
them contradict the reality of the actual financial market. However, this model serves as a base from
which other, more realistic models could be derived. This is done by relaxing some of the
aforementioned assumptions, thus making the model more applicable for actual forecasting.
Another class of econometric models is factor models. These models assume that the
expected return of an asset could be determined as a reaction to changes in some economic
factors, such as GDP, inflation or oil prices.
A factor model tries to consider the main economic factors influencing the prices of assets. It
implies that any two given stocks are correlated with each other only through common economic
factors. Every factor influencing the expected return of a given asset which is not in the model is
considered unique; therefore it doesn't correlate with the unique factors of other assets.
2.6 Conclusion of Chapter 2.
In the course of this research, we will evaluate the abilities of the top analytical platforms (IBM
Watson Analytics, SAS Analytics, KNIME, and RapidMiner) to serve the needs of banks, audit
firms, insurance companies, and traders by assessing these platforms using a set of two main
KPIs: User-Friendliness and Range of Analytical Capabilities. These main KPIs are subdivided
into six criteria: Visualization, Simplicity of Use, Predictive Analytics, Econometric Modeling,
Textual Analytics, and Social Media Analytics.
Then we will evaluate the chosen analytical platforms using the Analytical Hierarchy Process
and the set of KPIs mentioned above. The AHP will be performed through a series of pairwise
comparisons, which will determine the relative weights of the criteria and the ranking of the
alternatives (IBM Watson Analytics, SAS Analytics, KNIME, and RapidMiner). After
conducting the AHP, we will determine the most appropriate analytical platform for the purposes of
stock market forecasting.
In the case of statistical packages, we will use theoretically based econometric models, and
in the case of IBM Watson Analytics we will let the platform suggest optimal models by itself.
This approach has a potential problem: the lack of theoretical grounding. For a trader, it may appear
to be irrelevant, since he/she mostly cares about the accuracy of forecasts; however, without a
theoretical basis it is impossible to guarantee the stability of the model: it could have just
happened that the factors which affected the predicted variables are spuriously correlated.
In the research, we will build several series of models. The first one will be standard random
walk models for currency exchange rates. It will be used for comparison with the other models,
since they will make sense only if they outperform the random walk.
Another series of predictive models for currency exchange rates will be constructed
using simple one-factor models that use the price of the most exported commodity as a predictor. The
dynamics of the stock market will be analyzed by applying the Capital Asset Pricing Model to the blue
chips of the United States stock exchange: Microsoft, Apple, IBM, Bank of America, Walmart,
and P&G. The US stock market was chosen because of the necessity of operating under the
Efficient Market Hypothesis, which is more likely to hold in a developed market than in an
emerging one.
The final series of predictive models will be constructed in Gretl, but in this case, the factors will
be chosen based on the suggestions of IBM Watson Analytics, which automatically determines
the drivers of a given variable.
The predictive accuracy of the forecasts generated by the aforementioned models will be
estimated by two characteristics: Mean Absolute Percentage Errors and potential profitability.
The latter characteristic will be assessed through the results of a trading simulation experiment,
during which we will imitate real-life trading using all of the models we have constructed.
Chapter 3. Empirical estimation of analytical platforms.
3.1 Evaluation of the Analytical Platforms
3.1.1 Justification of the choice of analytical platforms taken for consideration
According to the Gartner Magic Quadrant for advanced analytical platforms (2016), the
market is divided into four categories: Leaders, Challengers, Visionaries, and Niche players. This
classification is based on the vendors' ability to execute (performance metrics) and their completeness
of vision, which could be interpreted as future prospects. See Figure 3.
There are several vendors and analytical products in each category. In this research we
focus our attention on the leaders and visionaries. Among the market leaders are IBM, SAS,
Dell, KNIME, and RapidMiner. Visionaries are represented by Microsoft, Alteryx, Alpine Data,
and Predixion Software.
Another report, the Forrester Wave (2015), suggests a different picture, based on
current performance and strategy: IBM and SAS remain the leaders. However, KNIME,
RapidMiner and Dell are removed from the leaders section and ranked as strong performers. See
Figure 4.
According to both reports, IBM is the market leader: it shares this place with SAS, but
according to the Forrester Wave (2015), it has a better perspective for the future (a higher strategy rank).
KNIME and RapidMiner occupy similar positions in both rankings; they also offer similar
approaches to analytics - both are available offline and provide clients with a good cost-benefit ratio, as
Piatetsky (2016) states. So, we will choose four platforms for further analysis: IBM Watson,
SAS, KNIME, and RapidMiner. Dell is set aside because it is noticeably behind the other leaders.
Figure 3. Gartner’s Magic Quadrant for Advanced Analytical platforms 2016
Figure 4. Forrester Wave 2015
3.1.2 Results of Evaluation of Analytical Platforms
We will apply a simple Analytical Hierarchy Process method, using the BPMSG AHP Online
System (http://bpmsg.com/academic/ahp.php). The goal of the AHP is to choose the most appropriate
analytical platform for stock price forecasting. According to Gartner's Magic Quadrant for
Advanced Analytics (09 February 2016, ID: G00275788), the key alternatives are IBM Watson,
SAS, KNIME, and RapidMiner. We have two main criteria: User-Friendliness and Range of
Analytical Capabilities. These criteria could be broken down into sub-criteria as follows:
1. User-Friendliness: Visualization, and Simplicity of Use.
2. Range of Analytical Capabilities: Predictive Analytics, Econometric Modeling,
Textual Analytics, and Social Media Analytics.
The relative weights of these criteria were determined through a series of pairwise
comparisons. The comparisons were made as follows:
1. Range of Analytical Capabilities is more important than User-Friendliness.
2. Simplicity of Use is more important than Visualization.
3. Predictive Analytics is equally important to Econometric Modeling; Predictive
Analytics is more important than Textual Analytics and Social Media Analytics;
Econometric Modeling is more important than Textual Analytics and Social
Media Analytics; Textual Analytics is equally important to Social Media
Analytics.
The descriptions of these criteria are presented in the Table 1, and the results of this pairwise
comparison in the BPMSG AHP Online System are presented in the Table 2 (a short sketch
reproducing the resulting weights follows Table 2).
Table 1. Description of the criteria.
Criteria | Description
Visualization | Refers to the quality of data and analysis visualization provided by a platform.
Simplicity of Use | Refers to the requirements of IT and statistical expertise.
Predictive Analytics | The ability to suggest predictive factors.
Econometric Modeling | The range of statistical and econometrical tools which a platform provides.
Textual Analytics | Reflects the range of textual analytics techniques provided by a platform.
Social Media Analytics | Reflects mostly the range of social media and news sources which a platform is capable of analyzing.
Table 2. Decision Hierarchy
Level 0 | Level 1 | Level 2 | Global Priorities
Analytical Platform | User-Friendliness | Visualization | 11.1 %
Analytical Platform | User-Friendliness | Simplicity of Use | 22.2 %
Analytical Platform | Range of Analytical Capabilities | Predictive Analytics | 22.2 %
Analytical Platform | Range of Analytical Capabilities | Econometric Modeling | 22.2 %
Analytical Platform | Range of Analytical Capabilities | Textual Analytics | 11.1 %
Analytical Platform | Range of Analytical Capabilities | Social Media Analytics | 11.1 %
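A minimal sketch of the underlying calculation, assuming numpy and assuming that each "more important" judgment above was entered with a 2:1 ratio (which reproduces the published global priorities):

    import numpy as np

    def ahp_weights(matrix):
        """Principal-eigenvector weights of a pairwise comparison matrix."""
        vals, vecs = np.linalg.eig(np.asarray(matrix, dtype=float))
        w = np.real(vecs[:, np.argmax(np.real(vals))])
        return w / w.sum()

    # Level 1: Range of Analytical Capabilities vs User-Friendliness (2:1)
    top = ahp_weights([[1, 1/2], [2, 1]])          # -> [1/3, 2/3]
    # User-Friendliness: Simplicity of Use vs Visualization (2:1)
    uf = ahp_weights([[1, 1/2], [2, 1]])           # -> [Vis 1/3, Simp 2/3]
    # Range: Predictive = Econometric > Textual = Social (2:1)
    rc = ahp_weights([[1, 1, 2, 2], [1, 1, 2, 2],
                      [1/2, 1/2, 1, 1], [1/2, 1/2, 1, 1]])

    print(top[0] * uf)   # ~[0.111, 0.222] for Visualization, Simplicity of Use
    print(top[1] * rc)   # ~[0.222, 0.222, 0.111, 0.111]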
Our next step is to evaluate the alternatives using these criteria. The results of the
evaluation are shown in the Table 3.
Table 3. Evaluation of the Platforms (sources: Bloor Group, KNIME Documentation,
SAS Product Documentation, and RapidMiner Documentation).

Visualization
Platform | Priority | Rank
SAS | 39.5% | 1
IBM Watson | 27.8% | 2
KNIME | 16.3% | 3
RapidMiner | 16.3% | 3
Comments: This ranking is based on how well visualization is integrated into the analytical process.

Simplicity
Platform | Priority | Rank
IBM Watson | 39.5% | 1
SAS | 27.8% | 2
KNIME | 16.3% | 3
RapidMiner | 16.3% | 3
Comments: IBM occupies the first place because it doesn't require deep statistical expertise from the user and offers a simple interface. The reason why both KNIME and RapidMiner hold the 3rd rank is that they require some level of statistical expertise and have more complicated interfaces.

Predictive Analytics
Platform | Priority | Rank
IBM Watson | 30.0% | 1
SAS | 30.0% | 1
KNIME | 20.0% | 2
RapidMiner | 20.0% | 2
Comments: Both IBM Watson and KNIME directly state the predictive analytics function.

Econometric Modeling
Platform | Priority | Rank
SAS | 28.6% | 1
KNIME | 28.6% | 1
RapidMiner | 28.6% | 1
IBM Watson | 14.3% | 2
Comments: All platforms except for IBM Watson offer a broad range of econometrical and statistical models, while IBM Watson has replaced it with Data Exploration and Predictive functions.

Textual Analytics
Platform | Priority | Rank
IBM Watson | 40.0% | 1
SAS | 20.0% | 2
KNIME | 20.0% | 2
RapidMiner | 20.0% | 2
Comments: All of the platforms offer textual analytics functions, but IBM Watson is the only one capable of answering questions formulated in natural language.

Social Media Analytics
Platform | Priority | Rank
IBM Watson | 28.6% | 1
SAS | 28.6% | 1
KNIME | 28.6% | 1
RapidMiner | 14.3% | 2
Comments: All platforms have social media analytics functions; however, RapidMiner can analyze only Twitter.

After evaluating the alternatives (IBM Watson, SAS, KNIME, and RapidMiner) in the
BPMSG AHP Online System, we have the results, which are presented in the Table 4.
Table 4. Ranking of Analytical platforms.
Platform | Priority | Rank
IBM Watson | 29.4% | 1
SAS | 29.0% | 2
KNIME | 21.6% | 3
RapidMiner | 20.0% | 4
As we can see in the Table 4, IBM and SAS are almost equal with regard to their
suitability for stock price forecasting, according to the AHP method. Overall, the result is
consistent with Gartner's Magic Quadrant and the Forrester Wave. However, IBM Watson has
scored a bit better, which is why we will use it for our further analysis, presented in the
next chapter.
3.2 Evaluation of the forecasting accuracy of IBM Watson Analytics
3.2.1 Data description
In the Table 5, we can see a description of the data we will use in the stock market
forecasting experiments. Variables are classified into four categories: stock prices, prices of
resources (gold, oil, and natural gas), values of market indexes, and currency exchange
rates. Observations cover the period from 01.30.2015 to 01.04.2016.
We will use two types of software to run the predictive modeling: the Gretl statistical
package and IBM Watson Analytics. The type of models marked as IBM+Gretl in the
Table 5 was built in the following steps: after uploading the dataset to IBM Watson Analytics, the
predictive function was applied. It suggested predicting factors for each target variable
(stocks and currency exchange rates); after that, simple two-factor regression models were
built in Gretl, using the predictive factors suggested by IBM Watson Analytics as
independent variables. The random walk models are basically just ARIMA(0,1,0) models. They
will be used just as a basis for comparison.
Table 5. Data description (Source: Finam).
Software | Model | Variables | Number of observations
Gretl | Random Walk Models | EUR/USD | 429
Gretl | Random Walk Models | USD/CAD | 426
Gretl | Random Walk Models | USD/JPY | 428
Gretl | Random Walk Models | USD/ZAR | 426
Gretl | Random Walk Models | USD/NOK | 425
Gretl | Random Walk Models | USD/CNY | 393
Gretl | Random Walk Models | USD/RUB | 426
Gretl | One-Factor models | USD/NOK, USD/ZAR, USD/RUB; BRENT, Gold | 363
Gretl | CAPM | S&P 500; BAC, IBM, MSFT, P&G, Walmart, Apple | 286
IBM+Gretl | Two-Factor models | S&P 500, DJI, RTS, Nikkei, CSI, FTSE, Shanghai, NASDAQ, Gold, Natural Gas, Brent, EUR/USD, USD/CAD, USD/JPY, USD/ZAR, USD/NOK, USD/CNY, USD/RUB, Exxon Mobil, Chevron, BAC, IBM, P&G, Walmart, Apple | 225
One-factor models predict the currency exchange rates based on the prices of the
most exported commodities (oil and gold). CAPMs predict the prices of the stocks. They were built
using weekly prices of the "blue chips" of the US stock market. The role of the average market
indicator was played by the S&P 500 index. The interest rate of the 4-week Treasury bills was used as
the risk-free rate (Rfr = 2%). The return on an asset is calculated as the difference between the stock's price
at moment t and the stock's price at moment t-1, divided by the stock's price at moment t-1:
R = (P_t - P_{t-1}) / P_{t-1}   (12)
Specifications of the models are shown in the Appendix 1 and Appendix 2.
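A minimal sketch of these two recurring calculations (assuming numpy arrays of actual and forecasted prices; MAPE as used throughout this chapter):

    import numpy as np

    def returns(prices):
        """Period returns per equation (12)."""
        prices = np.asarray(prices, dtype=float)
        return (prices[1:] - prices[:-1]) / prices[:-1]

    def mape(actual, forecast):
        """Mean Absolute Percentage Error, in percent."""
        actual, forecast = np.asarray(actual), np.asarray(forecast)
        return 100 * np.mean(np.abs((actual - forecast) / actual))

    prices = [100.0, 102.0, 101.0]           # hypothetical weekly prices
    print(returns(prices))                   # [0.02, -0.0098...]
    print(mape([1.10, 1.12], [1.11, 1.10]))  # ~1.35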
3.2.2 Forecasting stock prices with theoretically based models.
3.2.2.1 Results of the Random Walk models for currencies.
The random walk model is the basis for comparison for any other forecasting model, as a
predictive model makes sense only if it beats the random walk. Using the Gretl statistical and
econometrical package, we have built ARIMA(0,1,0) time series models, which are equivalent to
the simple random walk. In the Table 6, we can see the error metrics for the random walk models
for currency exchange rates.
Table 6. Percentage errors of Random Walk models
Model | MPE | MAPE
EUR/USD | 0.004209 | 0.004209
USD/CAD | -0.0011137 | 0.056509
USD/NOK | -0.077933 | 0.077933
USD/RUB | -0.27649 | 0.27649
USD/ZAR | 8.1111 | 8.1111
USD/CNY | -0.0089793 | 0.0089793
USD/JPY | -0.032764 | 0.032764
As we can see in the Table 6, the random walk models have produced quite small mean
percentage errors and mean absolute percentage errors, with the exception of the USD to South
African Rand exchange rate (ZAR). This might give the impression that the random walk performs
well; however, as supported by Elliot, G. (2013), for the purposes of profiting from the
differences in exchange rates, it is more important to foresee the direction of change than
to give a more accurate estimation. The low percentage error in the random walk case could be
caused by the fact that the forecasted value differs from the previous observation only by a small
random value.
Our next step is to estimate the potential profitability of trading the main currencies using the
random walk model. For that purpose, we have run the simulation test in Excel 2013, using the
following rule: if the investor expects appreciation of the asset, then he buys it, and vice versa. The
results are shown in the Table 7. We have used the 30 last forecasted values of each currency
exchange rate for an imitation of real-life trading.
Table 7. Results of the simulation of Random Walk
Trading simulation (Random Walk)
Model | Profitability
EUR/USD | 0.39%
USD/CAD | -6.20%
USD/NOK | -1.38%
USD/RUB | -17.45%
USD/ZAR | -0.46%
USD/CNY | 0.96%
USD/JPY | 1.64%
As expected, the results of the simulation reveal that the random walk model is absolutely unfit
for trading: in 4 out of 7 cases, the profitability is negative, especially in the case of the Ruble, which
has shown over -17% losses. Even the positive examples have very low profitability. The average
return is -3.2%, and if it were real-life trading, then the losses would be even bigger, as there are
transaction costs and time lags. Thus, it is safe to conclude that the random walk model is
completely unfit for real-life application.
3.2.2.2 Currency exchange rate forecasting using factor models.
Table 8 presents the description of the factor models. In accordance with Domenico, F. (2015),
we have built predictive models for currency exchange rate forecasting using the prices of the most
traded commodities as predictors. The models were built in the Gretl econometrical package using
the "ordinary least squares" option.
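A minimal sketch of an equivalent estimation, assuming the statsmodels library and hypothetical Brent and USD/NOK series (the coefficients here are assumed, loosely echoing the magnitudes in Table 8):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    brent = 60 + rng.normal(scale=5, size=200)   # hypothetical Brent prices
    usdnok = 10.1 - 0.039 * brent + rng.normal(scale=0.2, size=200)

    X = sm.add_constant(brent)            # const + one commodity-price factor
    model = sm.OLS(usdnok, X).fit()       # ordinary least squares
    print(model.params, model.rsquared)   # coefficients and R-squared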
Table 8. Description of factor models for currencies.
Model | Parameters (Coefficient, Sig.) | R-squared | MPE | MAPE
USD/NOK | const 10.1061 (<0.0001); Brent -0.039167 (<0.0001) | 0.79542 | -3.6611 | 3.6611
USD/RUB | const 83.0842 (<0.0001); Brent -0.48308 (<0.0001) | 0.770869 | 4.0127 | 4.0127
USD/ZAR | const 26.4943 (<0.0001); Gold -0.011423 (0.0003) | 0.169913 | 14.503 | 14.503
USD/CAD | const 1.4658 (<0.0001); Brent -0.0041481 (<0.0001) | 0.771673 | -0.24727 | 4.19840
All factors are statistically significant, and they have the expected influence on every
currency (the higher the price of the commodity, the lower the USD exchange rate). However, these
models demonstrate bigger mean percentage errors than the random walk; in that sense, they
don't beat the random walk.
Three out of four models have a high R-squared (>0.7), which implies good explanatory
power. The only exception is the USD/ZAR model, which has a very low R-squared
(0.169) and the highest Mean Absolute Percentage Error (14%). This result suggests that
gold is no longer the main export product of South Africa.
Our next step is to estimate the potential profitability of trading the main currencies using the
simple one-factor regressions. For that purpose, we have run the simulation test in Excel 2013. We have
used the 30 last forecasted values of each currency exchange rate for an imitation of real-life
trading.
As shown in the Table 9, trading with the factor models brings much higher returns than
the random walk, because the factor models manage to generate more accurate predictions of the
direction of the price change. The average return for these models is 26%.
Table 9. Results of the simulation of the factor models.
Trading simulation (Factor regression)
Model | Factors | Profitability
USD/CAD | Brent | 0.2151143
USD/RUB | Brent | 0.3132015
USD/ZAR | Gold | 0.2284068
USD/NOK | Brent | 0.2875645
3.2.2.3 Stock forecasting using the CAPM model.
Using the "ordinary least squares" function in the Gretl statistical package, we have built a CAPM
for each of the following stocks: Apple, IBM, Microsoft, Procter & Gamble, Walmart, and Bank of
America. As a factor, we have used the risk premium:
RP = (mu_M - R_0)   (13)
where mu_M is the return on the S&P 500 index, and R_0 is the four-week Treasury bill interest rate.
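A minimal sketch of this estimation, assuming statsmodels and hypothetical weekly return series (rp is the risk premium from equation (13); the beta of 1.3 is an assumed value):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    r0 = 0.02 / 52                                  # weekly risk-free rate
    market = rng.normal(0.001, 0.02, size=280)      # hypothetical S&P 500 returns
    stock = 1.3 * (market - r0) + rng.normal(0, 0.01, size=280)

    rp = market - r0                                # risk premium, equation (13)
    capm = sm.OLS(stock, sm.add_constant(rp)).fit()
    print(capm.params)    # [alpha, beta]; beta close to the assumed 1.3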
As we can see in the Table 10, the CAPM produces quite poor results both in terms
of explanatory power (low R-squared) and accuracy of forecasts; sometimes the mean percentage
errors exceed 100% (average MAPE = 177%), meaning that the forecasts are radically different
from reality. Despite the fact that in all cases the risk premium was a significant factor, and the
R-squared is tolerable (except for the Walmart case), the models appear to be unfit for actual
forecasting, because of the huge deviations of the forecasted values from the actual ones.
Table 10. Description of CAPM for stocks.
Model | Parameters (Coefficient, Sig.) | R-squared | MPE | MAPE
Bank of America | const 0.000181202 (<0.0001); RP 1.32574 (<0.0001) | 0.549330 | 574.13 | 574.13
Microsoft | const 0.00130152 (0.0833); RP 1.13318 (<0.0001) | 0.464251 | 202.74 | 258.91
Apple | const -0.00068693 (<0.0001); SP 1.14488 (<0.0001) | — | — | —
Walmart | const -0.0015747 (0.0246); RP 0.592545 (<0.0001) | 0.214836 | 59.562 | 59.562
IBM | const -0.000336033 (<0.0001); RP 0.919689 (<0.0001) | 0.453587 | 72.152 | 72.152
P&G | const -0.000790775 (0.0787); RP 0.671332 (<0.0001) | 0.459370 | 66.181 | 66.181
Our next step is to estimate the potential profitability of trading the blue-chip stocks using the
CAPM. For that purpose, we have run the simulation test in Excel 2013. We have used the 30
last forecasted values of each stock for an imitation of real-life trading.
The results of the trading simulation (Table 11) confirm the point that the CAPM is unfit for
stock market forecasting. The CAPM has generated a significant potential outcome in only 2 out of 6
cases; in two cases, the results were negative, and the last two have demonstrated negligible
profits, which would not even cover transaction costs. The average return is 5%, which
demonstrates that despite the huge deviations of the forecasted values from the actual ones, in some cases
the CAPM still correctly predicts the direction of change.
Table 11. Results of the CAPM simulation.
Model | Profitability
BAC | 0.1774171
IBM | 0.19201815
MSFT | 0.03278492
P&G | -0.0342963
Walmart | 0.01295297
Apple | -0.0838422
3.2.3 Forecasting the stock market using IBM Watson Analytics.
3.2.3.1 Models for stock forecasting.
We have used the free version of IBM Watson Analytics to conduct our experiment. After
uploading our dataset consisting of 26 variables, the IBM Watson Predict option automatically
processed and analyzed the uploaded data. The result is a set of suggested predictive factors that
drive any given variable. Based on the predictive power of the models, as estimated by Watson
Analytics, we have chosen the most promising ones. Forecasting of stock prices and currency
exchange rates using IBM Watson will be done using the IBM Watson Analytics "Predict" function
in two steps:
1. Choosing the factors which IBM Watson Analytics suggests as the best predictors.
2. Building a two-factor regression using the Ordinary Least Squares method in the Gretl
statistical package (a minimal sketch of this step follows the list).
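A minimal sketch of step 2, assuming statsmodels and hypothetical index and exchange-rate series (the coefficients are assumed numbers, loosely echoing the magnitudes reported later in Table 13):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 225
    dji = 17000 + rng.normal(scale=500, size=n)     # hypothetical index level
    usdcny = 6.4 + rng.normal(scale=0.1, size=n)    # hypothetical exchange rate
    stock = 390 + 0.005 * dji - 57.8 * usdcny + rng.normal(scale=3, size=n)

    X = sm.add_constant(np.column_stack([dji, usdcny]))  # const + two factors
    model = sm.OLS(stock, X).fit()
    print(model.params)      # [const, factor 1, factor 2]
    print(model.rsquared)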
In the Table 12, we can see which variables were chosen as predictors and which were
chosen as targets. The results of applying the IBM Watson Analytics Predict function are shown in
the Appendix 4.
Table 12. Targets and inputs for stock models.
Targets - Prices of stock: 12. Exxon Mobil; 13. Chevron; 14. BAC; 15. IBM; 16. P&G; 17. Walmart; 18. Apple.
Input - Stock Indices & resource prices: 1. S&P 500; 2. DJI; 3. RTS; 4. Nikkei; 5. CSI; 6. FTSE; 7. Shanghai; 8. NASDAQ; 9. Gold; 10. Natural Gas; 11. Brent.
Using the suggested drivers of the predicted values, we have built regression models in the Gretl
statistical package for each of the observed stocks. The results are presented in
the Table 13.
Table 13. Description of models built based on IBM Watson suggestions.
Model | Parameters (Coefficient, Sig.) | R-square | MPE | MAPE | Predictive Power (%)
Exxon Mobil 1 | const -4.57266 (0.0226); Gold 0.00725082 (<0.0001); Futsee 100 0.0339528 (<0.0001) | 0.874521 | 78.6593 | 1.19685 | 87.4
Exxon Mobil 2 | const -46.8102 (<0.0001); DJI 0.0405594 (<0.0001); Gold 0.00467673 (<0.0001) | 0.827575 | -0.05769 | 1.8791 | 85.2
Exxon Mobil 3 | const -53.5047 (<0.0001); SP 500 0.0398747 (<0.0001); Gold 0.0463417 (<0.0001) | 0.762696 | -0.079892 | 2.2036 | 83.6
IBM 1 | const — (<0.0001); Brent — (0.0916); NKK225 0.000636722 (<0.0001) | 0.712977 | -0.10002 | 2.3478 | 93.0
IBM 2 | const 241.98 (<0.0001); NASDAQ100 0.00448579 (<0.0001); USDZAR -8.25771 (<0.0001) | 0.817450 | 7.5400 | 7.5400 | 92.3
P&G | const 151.348 (0.0640); USDJPY -0.683958 (—); Brent 0.197202 (—) | 0.305893 | -0.19997 | 3.199 | 82.4
Bank of America | const -19.479 (<0.0001); Natural Gas 2.11465 (<0.0001); NASDAQ100 0.00683298 (<0.0001) | 0.928082 | -0.066927 | 2.0298 | 93.7
Apple 1 | const 390.097 (<0.0001); DJI 0.00533233 (<0.0001); USDCNY -57.7748 (<0.0001) | 0.934771 | 1.2012 | 1.2012 | 96.3
Apple 2 | const -9.95716 (0.0590); Brent 0.879771 (<0.0001); NASDAQ100 0.0186525 (<0.0001) | 0.627249 | -1.5062 | 1.5062 | 94.5
Walmart 1 | const 191.434 (<0.0001); USDZAR -5.50512 (<0.0001); NKK225 -0.00250334 (<0.0001) | 0.784290 | -1.4992 | 1.4992 | 93.3
Walmart 2 | const -33.1095 (<0.0001); NKK225 -0.00304983 (<0.0001); Footse100 0.0248156 (<0.0001) | 0.601377 | -1.7314 | 1.7314 | 91.3
Walmart 3 | const 151.821 (<0.0001); Brent 0.695071 (<0.0001); USDJPY -0.971299 (<0.0001) | 0.754404 | -0.29461 | 4.295 | 89.3
Chevron 1 | const -76.5478 (<0.0001); Gold 0.0393618 (<0.0001); Footse100 0.0192772 (<0.0001) | 0.808380 | -0.20033 | 3.493 | 90.7
Chevron 2 | const 119.11 (<0.0001); NASDAQ100 0.0063668 (0.0502); USDZAR -3.97567 (<0.0001) | 0.425434 | -0.6588 | 6.4729 | 88.7
Coke 1 | const 43.2875 (<0.0001); Gold 0.00589016 (<0.0001); Natural Gas -3.42148 (<0.0001) | 0.607398 | -0.0641 | 2.0372 | 79.9
Coke 2 | const -18.4188 (0.0002); USDCNY 7.84844 (<0.0001); Gold 0.00890342 (<0.0001) | 0.407077 | 7.5483 | 7.5483 | 82.3
Analyzing the results, we can see that three of the models (IBM 1, IBM 2, and
Chevron 2) turned out to be statistically insignificant. That strange result could be explained by
the fact that some potentially important predictors were not included in the uploaded dataset:
IBM Watson simply didn't have enough data to generate good models for these stocks.
The R-squared is high, or at least tolerable, in all cases with the exception of P&G.
Additionally, there are two models with borderline explanatory power, Coke 2 and Chevron 2,
whose R-squared equals 0.407 and 0.425 respectively. The mean percentage errors are quite low, but
they are still higher than those of a random walk model.
As a next step, we have estimated the potential profitability of trading stocks using the regression
models with factors suggested by IBM Watson Analytics. For that purpose, we have run the
simulation test in Excel 2013. The results are shown in the Table 14. We have used the 30 last
forecasted values of each stock for an imitation of real-life trading.
Table 14. Results of the Simulation of IBM predictive models.
Trading simulation (Factor regression)
Model | Factors | Profitability
Apple 1 | DJI, USDCNY | 0.4529254
Apple 2 | Brent, Nasdaq 100 | 0.426352
Exxon Mobil 1 | Gold, Futsee 100 | 0.0992644
Exxon Mobil 2 | Gold, DJI | 0.4689172
Exxon Mobil 3 | SP 500, Gold | 0.4479888
IBM 1 | Brent, NKK225 | 0.1694079
IBM 2 | NASDAQ100, USDZAR | 0.1816904
PG | USD/JPY, Brent | -0.0611271
Bank of America | Natural Gas, NASDAQ100 | 0.3798725
Chevron 1 | Futsee 100, Gold | 0.1907649
Chevron 2 | NASDAQ100, USDZAR | 0.5317324
Walmart 1 | USDZAR, NKK 225 | 0.0200946
Walmart 2 | NKK 225, Futsee 100 | 0.2398046
Walmart 3 | Brent, USD/JPY | -0.0270778
Coke 1 | Gold, Natural Gas | 0.0003398
Coke 2 | Gold, USD/CNY | -0.1652693
We have ambivalent results. On the one hand, some of the models have demonstrated superior
results during the simulation (Apple 1, Apple 2, Exxon Mobil 2, Exxon Mobil 3, and Bank of
America); on the other hand, three models have demonstrated negative results (P&G, Walmart
3, and Coke 2), and one has shown negligibly small profitability (Walmart 1). The lowest results
were demonstrated by those models which turned out to be insignificant (IBM 1, IBM 2). As
mentioned before, the reason for these results could be the absence of some important factors in
the dataset.
Overall, the IBM Watson generated models have shown results that exceed any others in terms
of potential profitability. The average return is 20%, which is considerably better than that of the CAPM.
However, there is a problem of separating the profitable models from the unprofitable ones, and the
stability of the desirable performance over time is still in question.
3.2.3.2 Models for currency exchange rate forecasting.
For the currency exchange rates, we have followed the same two-step procedure as for stocks:
choosing the factors which IBM Watson Analytics suggests as the best predictors, and building
two-factor regressions using the Ordinary Least Squares method in the Gretl statistical package.
In the Table 15, we can see which variables were chosen as predictors and which were
chosen as targets. The results of applying the IBM Watson Analytics Predict function are shown in
the Appendix 3.
Table 15. IBM Watson for currencies (source of data: Finam).
Targets - Currencies: 1. USD/CAD; 2. USD/JPY; 3. USD/ZAR; 4. USD/NOK; 5. USD/CNY; 6. USD/RUB; 7. EUR/USD.
Input - Prices of stock: 12. Exxon Mobil; 13. Chevron; 14. BAC; 15. IBM; 16. P&G; 17. Walmart; 18. Apple.
Input - Stock Indices & resource prices: 1. S&P 500; 2. DJI; 3. RTS; 4. Nikkei; 5. CSI; 6. FTSE; 7. Shanghai; 8. NASDAQ; 9. Gold; 10. Natural Gas; 11. Brent.
As an example: after choosing the target and input variables, IBM Watson Analytics displays the
results as shown in Figure 5.
Figure 5. Screenshot of Watson Analytics Predictive function results.
The colored circles represent combinations of two predictive factors (stock indices, stock
prices, or prices of resources). The closer a circle is to the center, the higher the predictive
power. Using the suggested drivers of the predicted values, we have built two-factor regression models
in the Gretl statistical package for each of the observed currency exchange rates. The results are
presented in the Table 16.
Table 16. Description of currency exchange rate models.
Model | Parameters (Coefficient, Sig.) | R-squared | MPE | MAPE | Predictive Power (%)
EUR/USD 1 | const 1.02361 (<0.0001); Gold 0.000299935 (<0.0001); PG -0.00339848 (<0.0001) | 0.4096 | 2.8818 | 2.8818 | 63.4
USD/CNY 1 | const 6.94919 (<0.0001); Brent -0.0110162 (<0.0001); Shanghai -1.60684e-05 (0.0335) | 0.8981 | 0.0547 | 0.0547 | 96.0
USD/CNY 2 | const 7.22139 (<0.0001); Brent -0.00990401 (<0.0001); NKK225 -2.03678e-05 (<0.0001) | 0.9198 | -0.4768 | 0.4768 | 95.7
USD/JPY 1 | const 103.254 (<0.0001); BankAmerica 1.73662 (<0.0001); Gold -0.00945173 (<0.0001) | 0.8242 | -3.1761 | 3.1761 | 93.5
USD/JPY 2 | const 109.078 (<0.0001); Gold -0.020616 (<0.0001); NKK225 0.00184703 (<0.0001) | 0.8642 | -1.8067 | 1.8067 | 92.5
USD/NOK 1 | const 14.2993 (<0.0001); Gold -0.00358985 (<0.0001); Natural Gas -0.776106 (<0.0001) | 0.8408 | -0.9656 | 0.9656 | 90.9
USD/NOK 2 | const 9.92649 (<0.0001); Natural Gas -0.0331292 (0.0515); Brent -0.0323285 (<0.0001) | 0.8196 | -3.5709 | 3.5709 | 93.3
USD/ZAR | const 21.3444 (<0.0001); Natural Gas -0.977925 (<0.0001); Brent -0.108871 (<0.0001) | 0.9076 | -3.6134 | 3.6134 | 95.6
USD/RUB | const 104.251 (<0.0001); Brent -0.564791 (<0.0001); Shanghai -0.00349858 (<0.0001) | 0.9182 | -5.8449 | 5.8449 | 94.5
As we can see in the Table 16, all of the models are statistically significant and have
high values of R-squared, with the exception of the Euro to USD exchange rate model. In
terms of percentage errors, the models are still not capable of beating the random walk.
As a next step, we have estimated the potential profitability of trading currencies using the
regression models with factors suggested by IBM Watson Analytics. For that purpose, we have
run the simulation test in Excel 2013. We have used the 30 last forecasted values of each currency
exchange rate for an imitation of real-life trading.
The results of the simulation tests are shown in the Table 17. In all cases except for the Euro to
USD, the models were able to produce positive results, but the profitability is much lower than that
of the stock predicting models (10% vs. 26%); this result is quite surprising. It once again raises the
question of the stability of the performance of econometrical models.
Table 17. Results of the simulation for currencies
Trading simulation
Model | Factors | Profitability
EUR/USD | Gold, PG | 0.06876367
USD/CNY 1 | Brent, Shanghai | 0.0136448
USD/CNY 2 | Brent, NKK225 | 0.00554467
USD/JPY 1 | BankAmerica, Gold | 0.13238344
USD/JPY 2 | Gold, NKK225 | 0.12007505
USD/NOK 1 | Gold, Natural Gas | 0.0160639
USD/NOK 2 | Natural Gas, Brent | 0.08606294
USD/ZAR | Natural Gas, Brent | 0.12584994
USD/RUB | Brent, Shanghai | 0.39969893
3.2.3.3 Analysis of the results of stock price forecasting.
We have built a series of predictive models for stock price forecasting and currency
exchange rate forecasting. The first series was based on the random walk model. It was chosen as the
basis for comparison with the other models, as it is necessary for any predictive model to outperform the
random model in order to make at least some sense.
The random walk models have shown unbeatably small deviations of the forecasted values from
the actual ones, but the random walk model fails to correctly predict the direction of change;
therefore it is completely unfit for the purposes of trading. Another type of currency exchange
rate forecasting model we employed is the one-factor regression, which uses the price of the most
exported commodity as a predictor. In terms of deviations of forecasts from actual values, these models
failed to beat the random walk, but in terms of potential profitability, as demonstrated by
the simulation, they easily outperformed the random walk, demonstrating returns on the
level of 20-30%.
The next models we built were CAPM models for the "blue chips", with the S&P 500 index as
the average market asset. The CAPM has shown poor results in terms of both forecasting accuracy
and potential profitability. Its deviation from actual values sometimes exceeded 100%, and only
one model has shown substantial returns during the simulation.
The series of stock predictive models based on the suggestions of IBM Watson Analytics
has demonstrated results which are superior to all other models. In terms of forecasting
accuracy, they beat all models except for the random walk. Additionally, the simulation has
demonstrated high returns for most of the suggested models, with the exception of four models
with negative or unsubstantial returns. The results of currency exchange rate forecasting using
IBM Watson were worse than those of the simple one-factor regression models, though they still beat
the random walk in potential profitability. This raises the question of spurious correlation between the
variables.
Overall, IBM Watson Analytics is capable of suggesting effective predictive models.
However, it doesn't provide users with a detailed description of the nature of the interdependencies
between the variables. Further analysis is required in order to compute actual forecasts of the
variables in question.
3.3 Conclusion of the Chapter 3.
In the Chapter 3, we have identified four analytical platforms of interest, based on the
Gartner's Magic Quadrant for Advanced Analytical Platforms 2016 and the Forrester Wave 2015:
IBM Watson Analytics, SAS Analytics, KNIME, and RapidMiner. The main factor which
determined this choice is that they were identified as market leaders, strong performers, and
visionaries with the biggest potential for growth.
We have evaluated the analytical platforms using the Analytical Hierarchy Process with a set
of six KPIs: Visualization, Simplicity of Use, Predictive Analytics capabilities, Econometric
Modeling capabilities, Textual Analytics capabilities, and Social Media Analytics capabilities.
The results have shown that the most preferable analytical platforms for stock price forecasting are
IBM Watson and SAS Analytics. However, IBM scored a bit better, so we chose it as the
analytical platform of choice for stock market forecasting.
We have then examined more deeply how IBM Watson Analytics could be combined with
statistical packages. For that purpose, we have built a set of models: theoretically based ones and
those suggested by Watson Analytics. The series of stock predictive models based on the suggestions
of IBM Watson Analytics demonstrated results superior to all other models: in terms of forecasting
accuracy they beat all models except for the random walk, and the simulation demonstrated high
returns for most of the suggested models, with the exception of four models with negative or
unsubstantial returns. The results of currency exchange rate forecasting using IBM Watson were
worse than those of the simple one-factor regression models, though they still beat the random walk
in potential profitability, which raises the question of spurious correlation between the variables.
Overall, IBM Watson Analytics is capable of suggesting effective predictive models.
However, it doesn't provide users with a detailed description of the nature of the interdependencies
between the variables, and further analysis is required in order to compute actual forecasts of the
variables in question.
Final Conclusions
Discussion of the findings.
In the course of this research, we have completed the set of objectives stated in the
Research Framework chapter.
First of all, we have evaluated four analytical platforms of interest, based on the Gartner's
Magic Quadrant for Advanced Analytical Platforms 2016 and the Forrester Wave 2015: IBM Watson
Analytics, SAS Analytics, KNIME, and RapidMiner. The main factor which determined this
choice is that they were identified as market leaders, strong performers, and visionaries with the
biggest potential for growth.
Our next objectives were the evaluation and comparison of the analytical platforms based
on their ability to generate predictive models for stock price forecasting.
For the purposes of the evaluation, we have used a set of six KPIs: Visualization,
Simplicity of Use, Predictive Analytics capabilities, Econometric Modeling capabilities, Textual
Analytics capabilities, and Social Media Analytics capabilities. The result of applying the Analytical
Hierarchy Process has demonstrated that IBM Watson and SAS Analytics are the most
appropriate tools when it comes to forecasting the stock market. The whole ranking is shown in the
Table 18.
Table 18. Ranking of Analytical platforms.
Analytical platform | Priority | Rank
IBM Watson | 29.4% | 1
SAS | 29.0% | 2
KNIME | 21.6% | 3
RapidMiner | 20.0% | 4
IBM Watson Analytics has beaten SAS Analytics only by a hair. IBM Watson beats SAS in
simplicity of use, but SAS wins when it comes to the range of econometrical and statistical tools
which it offers to users. The ability to suggest predictive factors without preliminary analysis is
what distinguishes IBM Watson and SAS from the others. They are superior in their ability to
conduct the predictive analytics process, while the other platforms require statistical expertise in order
to be used to the full extent.
Our final objectives were to construct, evaluate and compare the results of theoretically
based econometric predictive models and IBM Watson Analytics suggested models. The results
have shown that in terms of deviations of forecasts from the actual values of the observed variables
(measured in terms of Mean Absolute Percentage Errors), the random walk is unbeatable.
However, when it comes to the potential profitability of the models (assessed through a trading
simulation), the theoretically based models have shown worse results than the IBM Watson Analytics
suggested models, with the exception of the models based on the prices of the most exported
commodities. This result could be explained by the fact that IBM Watson Analytics didn't
specify the nature of the interdependencies between the variables, meaning that further analysis is
required in order to determine the exact econometric equation.
Overall, the effectiveness of IBM Watson Analytics as a tool for suggesting predictive
models was confirmed.
To sum up, we provide direct answers to the research questions, as shown in the
Table 20.
Table 20. Research Questions and answers
Research question | Answer
Which analytical platforms are a better fit for the purposes of stock market forecasting? | IBM Watson Analytics and SAS.
Does IBM Watson Analytics suggest effective predictive models for stock forecasting, in comparison with standard theoretically based econometric models? | Yes, IBM Watson Analytics suggests effective predictive models; however, further analysis is required in order to build the most effective predictive model.
Theoretical implications.
1. Using the theoretical part of this work, similar studies of niche analytical platforms
(according to Gartner's Magic Quadrant of advanced analytical platforms), such as Prognoz,
Accenture, Fico, Megaputer, and Levastorm, could be conducted.
2. The research provides a ground for further studies of how different analytical platforms
and analytical software tools could be combined in order to construct predictive models.
3. The research can serve as a base for further studies of how big data challenges in the
financial sector could be tackled using analytical platforms.
4. There are some collateral theoretical results: the theory that currency exchange rates
could be effectively predicted using the price of the most exported commodities was
confirmed; however, these models have limited applicability, since they can predict exchange rates
only for those currencies which are strongly connected to one particular commodity. In other
words, this applies only to resource-exporting economies.
5. The inability of the CAPM to adequately predict stock prices even on a developed stock
market was confirmed; therefore, the Efficient Market Hypothesis is not met on the US stock
market.
6. The research has both confirmed and questioned the unbeatable random walk: in terms of
the deviation measures, the random walk remains unbeatable, but from the perspective of
forecasting the direction of change, it is outperformed both by theoretically based models and by
those that were suggested by IBM Watson Analytics.
Managerial implications.
1. This research provides interested parties (traders) with recommendations regarding
which analytical platforms to use for the purposes of stock price forecasting.
2. The research provides individual traders with tight budget constraints with a no-cost
combination of analytical tools (IBM Watson Analytics as a guide, and a statistical package
(Gretl) for the construction of the final model). This combination could prove to be quite
effective, since IBM Watson Analytics is the only tool capable of suggesting predictive
models without preliminary theoretical work.
3. The study has identified the analytical functions which an analytical platform should be able
to perform in order to address the business tasks of financial organizations.
4. The study provides the criteria by which analytical platforms can be chosen.
5. The study has contributed to the analysis of the market of financial analytics.
Limitations.
1. The Analytical Hierarchy Process embeds some level of subjectivity: the pairwise comparisons of
the criteria and alternatives could vary depending on the expert.
2. Only four out of many analytical platforms were chosen.
3. This study was conducted with the use of open-source data gathered from the Finam
website. Access to more variables would improve the ability of Watson Analytics to
generate better predictive models.
4. All predictive models were estimated under the assumption that an investor has real-time
access to all needed information and can react instantly, in accordance with the chosen model.
5. Finally, our simulations were run under the assumption that an investor has instant access
to all information needed for the model building, and that an investor can strike deals instantly,
before the market reacts to the changes.
List of references
1. (2012, June 26). Small and midsize companies look to make big gains with "big data," according to recent poll conducted on behalf of SAP.
2. Abdullah, L., J. Sunadia, and T. Imran. 2013. Ranking of human capital indicators using analytical hierarchy process. Paper presented at the Evaluation of Learning for Performance Improvement International Conference, Malaysia, 25-26 February 2013.
3. Antweiler, W. and Frank, M.Z. 2004. Is all that talk just noise? The information content of internet stock message boards. Journal of Finance 59 (3): 1259-1294.
4. Bologa, A., R. Bologa, and A. Florea. 2010. Big Data and Specific Analysis Methods for Insurance Fraud Detection. Database Systems Journal 4 (4): 30-39.
5. Cao, M., R. Chychyla, and Stewart, T. 2015. Big Data Analytics in Financial Statement Audit. Accounting Horizons 29 (2): 423-429.
6. Chung, W. 2014. BizPro: Extracting and categorizing business intelligence factors from textual news articles. International Journal of Information Management 34 (2): 272-284.
7. Cukier, K. 2013. The Economist, Data, data everywhere: A special report on managing information. February 25. Retrieved from http://www.economist.com/node/15557443
8. Curthberston, K. 1996. Quantitative Financial Economics. New York: John Wiley & Sons Inc.
9. Domenico, F., Kenneth, R., and Barbara, R. 2015. Can oil prices forecast exchange rates? An empirical analysis of the relationship between commodity prices and exchange rates. Journal of International Money and Finance 54: 116-141.
10. Doug, H. Gartner Advanced Analytics Quadrant 2015: Gainers, Losers.
11. Earley, E. 2015. Data analytics in auditing: Opportunities and challenges. Business Horizons 58: 493-500.
12. Elliot, G., and A. Timmermann. 2013. Handbook of Economic Forecasting. Elsevier Science and Technology Books, Inc.
13. Fan, J., Han, F., and Liu, H. 2014. Challenges of big data analysis. National Science Review 1 (2): 293-314.
14. Finam. http://www.finam.ru/analysis/quotes/?0=&t=8315698
15. Financial Analytics Market - Worldwide Market Forecasts (2013-2018). Research and Markets (http://www.researchandmarkets.com/research/6xj66l/financial)
16. Gandomi, A., and Murtaza, H. 2015. Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management 35: 137-144.
17. Gartner IT Glossary (n.d.). Retrieved from http://www.gartner.com/it-glossary/big-data/
18. Gema, B., Jason, J., and Jung, B. 2016. Social big data: Recent achievements and new challenges. Information Fusion 28: 45-59.
19. IBM Corporation website. (https://www.ibm.com/marketplace/cloud/watsonanalytics/purchase/us/en-us#product-header-top)
20. Imad, M., Kelly, B. 2014. The unbeatable random walk in exchange rate forecasting: Reality or myth? Journal of Macroeconomics 40: 69-81.
21. Jagadish, H.V. 2015. Big Data and Science: Myths and Reality. Big Data Research 2: 49-52.
22. Jianzheng, L., Jie, L., Weifeng, L., and Jiansheng, W. 2016. Rethinking big data: A review on the data quality and usage issues. ISPRS Journal of Photogrammetry and Remote Sensing 115: 134-142.
23. Joe, F., and Hair, J. 2007. Knowledge creation in marketing: the role of predictive analytics. European Business Review 19 (4): 303-315.
24. Johan, B., Huina, M., and Xiaojun, Z. 2015. Twitter mood predicts the stock market. Journal of Computational Science 2: 1-8.
25. KNIME website. Documentation. (https://tech.knime.org/documentation)
26. Konishi, S., and Kitagawa, G. 2007. Information Criteria and Statistical Modeling. New York: Springer.
27. Kwan, M. 2014. Big Data's Impact on Trading and Technology. Journal of Trading 9 (1): 54-56.
28. Kyunghee, Y., and Lucas, L. 2015. Big Data as Complementary Audit Evidence. Accounting Horizons 29: 431-438.
29. Mark, E. 2006. Fragile digital data in danger of fading past history's reach. Atlanta Journal Constitution, June 7, p. A1.
30. Matlis, J. 2006. Predictive Analytics. Computerworld, October 9.
31. Meese, R., Rogoff, K. 1983. Empirical exchange rate models of the seventies: do they fit out of sample? Journal of International Economy 14: 3-24.
32. Microsoft Azure Machine Learning website. (https://azure.microsoft.com/en-gb/pricing/details/machine-learning/)
33. Min, C., and Stewart, R. 2015. Big Data Analytics in Financial Statement Audits. Accounting Horizons 29 (2): 423-429.
34. Niemira, P.M., and G. F. Zukowski. 1998. Trading the Fundamentals. New York: John Wiley & Sons Inc.
35. Overview diagram of Microsoft Azure Machine Learning Capabilities. 2016. (https://azure.microsoft.com/en-us/documentation/articles/machine-learning-studio-overview-diagram/)
36. Pozi, S. 2014. Big Data, Big Opportunities. Best's Reviews. March.
37. Ramulkan, R. 2015. Financial Executive.
38. RapidMiner Studio Manual. 2014. (http://docs.rapidminer.com/downloads/RapidMiner-v6-user-manual.pdf)
39. RapidMiner website. (https://rapidminer.com/products/comparison/)
40. Ruta, D. 2014. Automated trading with machine learning on big data. IEEE Computer Society: 824-30.
41. SAP. Big Data and Smart Trading: How a Real-time Data Platform Maximizes Trading Opportunities; 2012.
42. SAS website. Product Documentation. (http://support.sas.com/documentation/index.html)
43. Schroeck, M., Shockley, R., Smart, J., Romero-Morales, D., and Tufano, P. 2012. Analytics: The real-world use of big data. How innovative enterprises extract value from uncertain data. IBM Institute for Business Value. Retrieved from http://www-03.ibm.com/systems/hu/resources/the real word use of big data.pdf
44. Schwager, J. D. 1996. Getting Started in Technical Analysis. New York: John Wiley & Sons Inc.
45. Shmueli, G. 2010. To Explain or to Predict? Statistical Science 25 (3): 289-310.
46. Smith, K. 2015. Big Data Discoveries. Best's Reviews. November.
47. Spandan, G., Soham, R., and Satyajit, C. 2014. News Analytics and Sentiment Analysis to Predict Stock Price Trends. International Journal of Computer Science and Information Technologies 5 (3): 3595-3604.
48. Srivastava, U., and Gopalkrishnan, S. 2015. Impact of Big Data Analytics on Banking Sector: Learning for Indian Banks. Procedia Computer Science 50: 643-52.
49. TechAmerica Foundation's Federal Big Data Commission. 2012. Demystifying big data: A practical guide to transforming the business of Government. Retrieved from http://www.techamerica.org/Docs/fileManager.cfm?f=techamerica-bigdatareport-final.pdf
50. Ventana Research. 2016. Five keys to choosing a comprehensive analytics platform.
51. Ventana Research. 2016. Gaining the edge in Banking with Business Analytics.
52. Ventana Research. 2016. Perspective Business Analytics.
53. Wu, H., Harris, W., Gongjun, Y., Vasudeva Akula, C., Jiancheng, S. 2015. A novel social media competitive analytics framework with sentiment benchmarks. Information & Management 52: 801-812.
54. Xinhui, T., Rui, H., Lei, W., Gang, L., and Jianfeng, Z. 2015. Latency critical big data computing in finance. The Journal of Finance and Data Science 1: 33-41.
Appendix 1. Specifications of Models.
Model 1: OLS, using observations 2010-02-01:2016-03-21 (T = 321)
Dependent variable: USDCAD
Coefficient
Std. Error
1.46591
0.0121235
−0.0041492 0.000127235
const
Brent
Mean dependent var
Sum squared resid
R-squared
F(1, 319)
Log-likelihood
Schwarz criterion
rho
1.084380
1.034060
0.769249
1063.443
465.4615
−919.3801
0.970162
t-ratio
120.9144
−32.6105
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
p-value
<0.0001
<0.0001
***
***
0.118339
0.056935
0.768526
1.3e-103
−926.9230
−923.9113
0.050208
Model 2: OLS, using observations 2010-02-01:2016-03-21 (T = 321)
Dependent variable: USDRUB
Coefficient
83.0842
−0.48308
const
Brent
Mean dependent var
Sum squared resid
R-squared
F(1, 319)
Log-likelihood
Schwarz criterion
rho
Std. Error
1.40507
0.014746
38.66361
13889.34
0.770869
1073.216
−1060.153
2131.848
0.959764
t-ratio
59.1318
−32.7600
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
p-value
<0.0001
<0.0001
***
***
13.76334
6.598503
0.770151
4.4e-104
2124.305
2127.317
0.057043
Model 2: OLS, using observations 2014-11-10:2016-03-21 (T = 72)
Dependent variable: USDNOK
const
Coefficient
10.1061
Std. Error
0.126586
t-ratio
79.8355
p-value
<0.0001
***
51
Brent
−0.039167
Mean dependent var
Sum squared resid
R-squared
F(1, 70)
Log-likelihood
Schwarz criterion
rho
0.00237413
8.067018
3.766776
0.795420
272.1652
4.052507
0.448319
0.802283
−16.4974
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
<0.0001
***
0.509242
0.231972
0.792498
8.06e-26
−4.105013
−2.292318
0.378347
Model 1: OLS, using observations 2014-11-10:2016-03-21 (T = 72)
Dependent variable: USDZAR
const
GOLD
Mean dependent var
Sum squared resid
R-squared
F(1, 70)
Log-likelihood
Schwarz criterion
rho
Coefficient
26.4943
−0.011423
Std. Error
3.52805
0.00301772
13.15584
152.4363
0.169913
14.32849
−129.1665
266.8863
0.983043
t-ratio
7.5096
−3.7853
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
p-value
<0.0001
0.0003
***
***
1.608249
1.475690
0.158054
0.000321
262.3330
264.1457
0.063645
Model 1: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: BAC
Coefficient
Std. Error
0.000181202 0.000738767
1.32574
0.0715066
const
SP
Mean dependent var
Sum squared resid
R-squared
F(1, 282)
Log-likelihood
Schwarz criterion
rho
−0.002243
0.042341
0.549330
343.7346
848.1800
−1685.062
0.060132
t-ratio
0.2453
18.5401
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
p-value
0.8064
<0.0001
***
0.018220
0.012253
0.547732
9.99e-51
−1692.360
−1689.434
1.879509
Model 2: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: IBM
const
Coefficient
Std. Error
−0.00033603 0.000621024
3
t-ratio
−0.5411
p-value
0.5889
52
SP
0.919689
Mean dependent var
Sum squared resid
R-squared
F(1, 282)
Log-likelihood
Schwarz criterion
rho
0.06011
−0.002018
0.029920
0.453587
234.0931
897.4862
−1783.674
0.041344
15.3001
<0.0001
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
***
0.013910
0.010300
0.451649
6.87e-39
−1790.972
−1788.046
1.913474
Model 3: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: MSFT
Coefficient
Std. Error
0.00130152 0.000748926
1.13318
0.0724899
const
SP
Mean dependent var
Sum squared resid
R-squared
F(1, 282)
Log-likelihood
Schwarz criterion
rho
−0.000771
0.043513
0.464251
244.3658
844.3013
−1677.305
0.001750
t-ratio
1.7378
15.6322
p-value
0.0833
<0.0001
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
*
***
0.016941
0.012422
0.462351
4.22e-40
−1684.603
−1681.677
1.995791
Model 4: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: PG
const
SP
Coefficient
Std. Error
−0.00079077 0.000448067
5
0.671332
Mean dependent var
Sum squared resid
R-squared
F(1, 282)
Log-likelihood
Schwarz criterion
rho
0.0433692
−0.002018
0.015575
0.459370
239.6139
990.1916
−1969.085
0.108871
t-ratio
−1.7649
p-value
0.0787
*
15.4795
<0.0001
***
S.D. dependent var
S.E. of regression
Adjusted R-squared
P-value(F)
Akaike criterion
Hannan-Quinn
Durbin-Watson
0.010090
0.007432
0.457453
1.52e-39
−1976.383
−1973.457
1.781336
Model 5: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: Wallmart

            Coefficient   Std. Error    t-ratio   p-value
  const     −0.0015747    0.000696924   −2.2595   0.0246    **
  SP        0.592545      0.0674565      8.7841   <0.0001   ***

Mean dependent var   −0.002658   S.D. dependent var   0.013022
Sum squared resid    0.037680    S.E. of regression   0.011559
R-squared            0.214836    Adjusted R-squared   0.212051
F(1, 282)            77.16046    P-value(F)           1.56e-16
Log-likelihood       864.7388    Akaike criterion     −1725.478
Schwarz criterion    −1718.180   Hannan-Quinn         −1722.552
rho                  0.059662    Durbin-Watson        1.878631
Model 6: OLS, using observations 2015-02-03:2016-03-04 (T = 284)
Dependent variable: Apple

            Coefficient     Std. Error    t-ratio   p-value
  const     −5.80202e-06    0.000709582   −0.0082   0.9935
  SP        1.14417         0.0686817     16.6590   <0.0001   ***

Mean dependent var   −0.002098   S.D. dependent var   0.016549
Sum squared resid    0.039062    S.E. of regression   0.011769
R-squared            0.495999    Adjusted R-squared   0.494212
F(1, 282)            277.5229    P-value(F)           7.41e-44
Log-likelihood       859.6272    Akaike criterion     −1715.254
Schwarz criterion    −1707.956   Hannan-Quinn         −1712.329
rho                  −0.100408   Durbin-Watson        2.195459
Appendix 2. Specification of models suggested by Watson Analytics
Model 2: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: EURUSD

               Coefficient     Std. Error    t-ratio    p-value
  const        2.43518         0.0506615      48.0676   <0.0001   ***
  Footse100    −8.46547e-05    3.6118e-06    −23.4384   <0.0001   ***
  USDNOK       −0.0954061      0.00357196    −26.7098   <0.0001   ***

Mean dependent var   1.101956    S.D. dependent var   0.023066
Sum squared resid    0.027962    S.E. of regression   0.011248
R-squared            0.764319    Adjusted R-squared   0.762186
F(2, 221)            358.3535    P-value(F)           4.38e-70
Log-likelihood       688.8764    Akaike criterion     −1371.753
Schwarz criterion    −1361.518   Hannan-Quinn         −1367.621
rho                  0.811820    Durbin-Watson        0.380038
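The two- and three-variable specifications in this appendix can be checked the same way. Below is a sketch under the same assumptions (a hypothetical fx_data.csv whose columns are named as in the tables) that also extracts the individual statistics reported above.

# Minimal sketch for a two-regressor model such as EURUSD on Footse100
# and USDNOK. File and column names are assumptions, not the thesis's data.
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

df = pd.read_csv("fx_data.csv", parse_dates=["Date"], index_col="Date")

X = sm.add_constant(df[["Footse100", "USDNOK"]])
res = sm.OLS(df["EURUSD"], X).fit()

print(res.params)                       # const and slope coefficients
print(res.rsquared, res.rsquared_adj)   # R-squared / adjusted R-squared
print(res.aic, res.bic)                 # Akaike / Schwarz criteria
print(durbin_watson(res.resid))         # values near 0 flag the strong
                                        # residual autocorrelation seen above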
Model 3: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDJPY

                 Coefficient   Std. Error   t-ratio   p-value
  const          111.292       2.97802      37.3710   <0.0001   ***
  BankAmerica    1.72864       0.062178     27.8014   <0.0001   ***
  Coke           −0.451838     0.0582075    −7.7625   <0.0001   ***

Mean dependent var   120.0950    S.D. dependent var   3.329135
Sum squared resid    379.9253    S.E. of regression   1.311152
R-squared            0.846280    Adjusted R-squared   0.844889
F(2, 221)            608.3391    P-value(F)           1.36e-90
Log-likelihood       −377.0150   Akaike criterion     760.0301
Schwarz criterion    770.2650    Hannan-Quinn         764.1614
rho                  0.883048    Durbin-Watson        0.230867
Model 4: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDJPY

                 Coefficient    Std. Error   t-ratio   p-value
  const          103.254        2.96398      34.8363   <0.0001   ***
  BankAmerica    1.73662        0.0716346    24.2427   <0.0001   ***
  Gold           −0.00945173    0.00189369   −4.9912   <0.0001   ***

Mean dependent var   120.0950    S.D. dependent var   3.329135
Sum squared resid    434.5323    S.E. of regression   1.402216
R-squared            0.824186    Adjusted R-squared   0.822595
F(2, 221)            518.0038    P-value(F)           3.79e-84
Log-likelihood       −392.0561   Akaike criterion     790.1123
Schwarz criterion    800.3472    Hannan-Quinn         794.2436
rho                  0.917690    Durbin-Watson        0.176774
Model 5: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDJPY

             Coefficient   Std. Error     t-ratio    p-value
  const      109.078       2.38948         45.6492   <0.0001   ***
  Gold       −0.020616     0.00148216     −13.9094   <0.0001   ***
  NKK225     0.00184703    6.42593e-05     28.7434   <0.0001   ***

Mean dependent var   120.0950    S.D. dependent var   3.329135
Sum squared resid    335.5763    S.E. of regression   1.232252
R-squared            0.864224    Adjusted R-squared   0.862995
F(2, 221)            703.3392    P-value(F)           1.50e-96
Log-likelihood       −363.1130   Akaike criterion     732.2260
Schwarz criterion    742.4610    Hannan-Quinn         736.3573
rho                  0.862122    Durbin-Watson        0.271155
Model 6: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDNOK

            Coefficient    Std. Error     t-ratio    p-value
  const     11.834         0.175394        67.4711   <0.0001   ***
  Gold      −0.00182426    0.000159866    −11.4112   <0.0001   ***
  Brent     −0.0299062     0.000884068    −33.8280   <0.0001   ***

Mean dependent var   8.245108    S.D. dependent var   0.389453
Sum squared resid    3.847638    S.E. of regression   0.131947
R-squared            0.886243    Adjusted R-squared   0.885213
F(2, 221)            860.8666    P-value(F)           4.8e-105
Log-likelihood       137.3467    Akaike criterion     −268.6934
Schwarz criterion    −258.4584   Hannan-Quinn         −264.5620
rho                  0.845472    Durbin-Watson        0.303042
Model 7: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDNOK

               Coefficient     Std. Error     t-ratio    p-value
  const        22.7357         0.356379        63.7964   <0.0001   ***
  EURUSD       −8.0025         0.299609       −26.7098   <0.0001   ***
  Footse100    −0.000878508    1.79426e-05    −48.9621   <0.0001   ***

Mean dependent var   8.245108    S.D. dependent var   0.389453
Sum squared resid    2.345393    S.E. of regression   0.103018
R-squared            0.930657    Adjusted R-squared   0.930030
F(2, 221)            1483.035    P-value(F)           8.5e-129
Log-likelihood       192.7874    Akaike criterion     −379.5748
Schwarz criterion    −369.3398   Hannan-Quinn         −375.4434
rho                  0.789141    Durbin-Watson        0.421785
Model 8: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDRUB

               Coefficient    Std. Error     t-ratio    p-value
  const        104.251        0.960429       108.5464   <0.0001   ***
  Brent        −0.564791      0.0220999      −25.5563   <0.0001   ***
  Shanghai     −0.00349858    0.000409167     −8.5505   <0.0001   ***

Mean dependent var   63.87961    S.D. dependent var   7.947528
Sum squared resid    1152.483    S.E. of regression   2.283606
R-squared            0.918179    Adjusted R-squared   0.917438
F(2, 221)            1240.007    P-value(F)           7.4e-121
Log-likelihood       −501.3014   Akaike criterion     1008.603
Schwarz criterion    1018.838    Hannan-Quinn         1012.734
rho                  0.903221    Durbin-Watson        0.178160
Model 9: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDRUB

            Coefficient   Std. Error   t-ratio    p-value
  const     118.242       1.30709       90.4620   <0.0001   ***
  Brent     −0.398387     0.0219356    −18.1617   <0.0001   ***
  RTSI      −0.0406336    0.00247756   −16.4007   <0.0001   ***

Mean dependent var   63.87961    S.D. dependent var   7.947528
Sum squared resid    691.7760    S.E. of regression   1.769239
R-squared            0.950887    Adjusted R-squared   0.950443
F(2, 221)            2139.413    P-value(F)           2.4e-145
Log-likelihood       −444.1352   Akaike criterion     894.2705
Schwarz criterion    904.5054    Hannan-Quinn         898.4018
rho                  0.860782    Durbin-Watson        0.275189
Model 10: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDZAR

                 Coefficient   Std. Error    t-ratio    p-value
  const          21.3444       0.213486       99.9806   <0.0001   ***
  Brent          −0.108871     0.00500692    −21.7442   <0.0001   ***
  NaturalGas     −0.977925     0.143414       −6.8189   <0.0001   ***

Mean dependent var   13.55606    S.D. dependent var   1.533969
Sum squared resid    48.50850    S.E. of regression   0.468503
R-squared            0.907556    Adjusted R-squared   0.906719
F(2, 221)            1084.815    P-value(F)           5.4e-115
Log-likelihood       −146.4926   Akaike criterion     298.9853
Schwarz criterion    309.2202    Hannan-Quinn         303.1166
rho                  0.879994    Durbin-Watson        0.205604
Model 11: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: USDZAR

            Coefficient    Std. Error     t-ratio    p-value
  const     19.7559        0.68409         28.8792   <0.0001   ***
  Brent     −0.137358      0.00344813     −39.8356   <0.0001   ***
  Gold      0.000517846    0.000623526      0.8305   0.4071

Mean dependent var   13.55606    S.D. dependent var   1.533969
Sum squared resid    58.53171    S.E. of regression   0.514635
R-squared            0.888454    Adjusted R-squared   0.887445
F(2, 221)            880.1244    P-value(F)           5.5e-106
Log-likelihood       −167.5296   Akaike criterion     341.0591
Schwarz criterion    351.2940    Hannan-Quinn         345.1904
rho                  0.898736    Durbin-Watson        0.164108
Model 15: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: IBM

             Coefficient    Std. Error     t-ratio   p-value
  const      78.6593        5.23793        15.0173   <0.0001   ***
  Brent      1.19685        0.043626       27.4344   <0.0001   ***
  NKK225     0.000636722    0.000342023     1.8616   0.0640    *

Mean dependent var   149.9098    S.D. dependent var   14.20635
Sum squared resid    5647.317    S.E. of regression   5.055044
R-squared            0.874521    Adjusted R-squared   0.873385
F(2, 221)            770.1224    P-value(F)           2.5e-100
Log-likelihood       −679.2987   Akaike criterion     1364.597
Schwarz criterion    1374.832    Hannan-Quinn         1368.729
rho                  0.917691    Durbin-Watson        0.194187
Model 16: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: IBM

                Coefficient   Std. Error   t-ratio    p-value
  const         241.98        13.0694       18.5150   <0.0001   ***
  NASDAQ100     0.00448579    0.00264733     1.6945   0.0916    *
  USDZAR        −8.25771      0.273081     −30.2391   <0.0001   ***

Mean dependent var   149.9098    S.D. dependent var   14.20635
Sum squared resid    8215.836    S.E. of regression   6.097190
R-squared            0.817450    Adjusted R-squared   0.815798
F(2, 221)            494.8132    P-value(F)           2.41e-82
Log-likelihood       −721.2856   Akaike criterion     1448.571
Schwarz criterion    1458.806    Hannan-Quinn         1452.702
rho                  0.938997    Durbin-Watson        0.120422
Model 22: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: Apple

                Coefficient   Std. Error   t-ratio   p-value
  const         −9.95716      5.24672      −1.8978   0.0590    *
  Brent         0.879771      0.0182226    48.2790   <0.0001   ***
  NASDAQ100     0.0186525     0.00122047   15.2831   <0.0001   ***

Mean dependent var   116.2154    S.D. dependent var   10.84963
Sum squared resid    1712.286    S.E. of regression   2.783505
R-squared            0.934771    Adjusted R-squared   0.934181
F(2, 221)            1583.529    P-value(F)           9.9e-132
Log-likelihood       −545.6434   Akaike criterion     1097.287
Schwarz criterion    1107.522    Hannan-Quinn         1101.418
rho                  0.829285    Durbin-Watson        0.335655

Model 23: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: Wallmart

             Coefficient    Std. Error     t-ratio    p-value
  const      191.434        9.74956         19.6351   <0.0001   ***
  USDZAR     −5.50512       0.303515       −18.1379   <0.0001   ***
  NKK225     −0.00250334    0.000344423     −7.2682   <0.0001   ***

Mean dependent var   69.57540    S.D. dependent var   8.425007
Sum squared resid    5900.174    S.E. of regression   5.166974
R-squared            0.627249    Adjusted R-squared   0.623875
F(2, 221)            185.9442    P-value(F)           4.38e-48
Log-likelihood       −684.2044   Akaike criterion     1374.409
Schwarz criterion    1384.644    Hannan-Quinn         1378.540
rho                  0.953880    Durbin-Watson        0.091654
Model 24: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: Wallmart

                Coefficient    Std. Error     t-ratio    p-value
  const         −33.1095       4.52621         −7.3151   <0.0001   ***
  NKK225        −0.00304983    0.000261806    −11.6492   <0.0001   ***
  Footse100     0.0248156      0.000918855     27.0071   <0.0001   ***

Mean dependent var   69.57540    S.D. dependent var   8.425007
Sum squared resid    3414.405    S.E. of regression   3.930623
R-squared            0.784290    Adjusted R-squared   0.782338
F(2, 221)            401.7626    P-value(F)           2.47e-74
Log-likelihood       −622.9428   Akaike criterion     1251.886
Schwarz criterion    1262.121    Hannan-Quinn         1256.017
rho                  0.903227    Durbin-Watson        0.187717
Model 25: OLS, using observations 2015-01-30:2015-09-10 (T = 224)
Dependent variable: Wallmart

             Coefficient   Std. Error   t-ratio   p-value
  const      151.821       13.7927      11.0074   <0.0001   ***
  Brent      0.695071      0.0380797    18.2531   <0.0001   ***
  USDJPY     −0.971299     0.121221     −8.0126   <0.0001   ***

Mean dependent var   69.57540    S.D. dependent var   8.425007
Sum squared resid    6309.686    S.E. of regression   5.343278
R-squared            0.601377    Adjusted R-squared   0.597770
F(2, 221)            166.7043    P-value(F)           7.28e-45
Log-likelihood       −691.7201   Akaike criterion     1389.440
Schwarz criterion    1399.675    Hannan-Quinn         1393.571
rho                  0.952624    Durbin-Watson        0.078185
Model 3: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: Coke

             Coefficient   Std. Error   t-ratio   p-value
  const      −18.4188      4.88511      −3.7704   0.0002    ***
  USDCNY     7.84844       0.668813     11.7349   <0.0001   ***
  Gold       0.00890342    0.00149152    5.9693   <0.0001   ***

Mean dependent var   41.68634    S.D. dependent var   1.654961
Sum squared resid    362.1419    S.E. of regression   1.280098
R-squared            0.407077    Adjusted R-squared   0.401711
F(2, 221)            75.86486    P-value(F)           8.25e-26
Log-likelihood       −371.6459   Akaike criterion     749.2918
Schwarz criterion    759.5268    Hannan-Quinn         753.4232
rho                  0.950880    Durbin-Watson        0.121149
Model 4: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: EURUSD

            Coefficient    Std. Error    t-ratio   p-value
  const     1.02361        0.0261078     39.2069   <0.0001   ***
  Gold      0.000299935    2.50379e-05   11.9793   <0.0001   ***
  PG        −0.00339848    0.00035657    −9.5310   <0.0001   ***

Mean dependent var   1.101956    S.D. dependent var   0.023066
Sum squared resid    0.070045    S.E. of regression   0.017803
R-squared            0.409610    Adjusted R-squared   0.404267
F(2, 221)            76.66453    P-value(F)           5.14e-26
Log-likelihood       586.0265    Akaike criterion     −1166.053
Schwarz criterion    −1155.818   Hannan-Quinn         −1161.922
rho                  0.883499    Durbin-Watson        0.238517
Model 6: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDCNY

               Coefficient     Std. Error     t-ratio    p-value
  const        6.94919         0.0176275      394.2246   <0.0001   ***
  Brent        −0.0110162      0.000405616    −27.1592   <0.0001   ***
  Shanghai     −1.60684e-05    7.50977e-06     −2.1397   0.0335    **

Mean dependent var   6.346921    S.D. dependent var   0.130698
Sum squared resid    0.388227    S.E. of regression   0.041913
R-squared            0.898085    Adjusted R-squared   0.897162
F(2, 221)            973.7338    P-value(F)           2.6e-110
Log-likelihood       394.2327    Akaike criterion     −782.4655
Schwarz criterion    −772.2305   Hannan-Quinn         −778.3341
rho                  0.880092    Durbin-Watson        0.229048
Model 7: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDCNY

             Coefficient     Std. Error     t-ratio    p-value
  const      7.22139         0.0385345      187.4007   <0.0001   ***
  Brent      −0.00990401     0.000320948    −30.8586   <0.0001   ***
  NKK225     −2.03678e-05    2.5162e-06      −8.0946   <0.0001   ***

Mean dependent var   6.346921    S.D. dependent var   0.130698
Sum squared resid    0.305649    S.E. of regression   0.037189
R-squared            0.919763    Adjusted R-squared   0.919037
F(2, 221)            1266.665    P-value(F)           8.6e-122
Log-likelihood       421.0179    Akaike criterion     −836.0358
Schwarz criterion    −825.8008   Hannan-Quinn         −831.9045
rho                  0.844110    Durbin-Watson        0.288563
Model 8: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDJPY

                 Coefficient    Std. Error   t-ratio   p-value
  const          103.254        2.96398      34.8363   <0.0001   ***
  BankAmerica    1.73662        0.0716346    24.2427   <0.0001   ***
  Gold           −0.00945173    0.00189369   −4.9912   <0.0001   ***

Mean dependent var   120.0950    S.D. dependent var   3.329135
Sum squared resid    434.5323    S.E. of regression   1.402216
R-squared            0.824186    Adjusted R-squared   0.822595
F(2, 221)            518.0038    P-value(F)           3.79e-84
Log-likelihood       −392.0561   Akaike criterion     790.1123
Schwarz criterion    800.3472    Hannan-Quinn         794.2436
rho                  0.917690    Durbin-Watson        0.176774
Model 9: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDJPY

             Coefficient   Std. Error     t-ratio    p-value
  const      109.078       2.38948         45.6492   <0.0001   ***
  Gold       −0.020616     0.00148216     −13.9094   <0.0001   ***
  NKK225     0.00184703    6.42593e-05     28.7434   <0.0001   ***

Mean dependent var   120.0950    S.D. dependent var   3.329135
Sum squared resid    335.5763    S.E. of regression   1.232252
R-squared            0.864224    Adjusted R-squared   0.862995
F(2, 221)            703.3392    P-value(F)           1.50e-96
Log-likelihood       −363.1130   Akaike criterion     732.2260
Schwarz criterion    742.4610    Hannan-Quinn         736.3573
rho                  0.862122    Durbin-Watson        0.271155
Model 10: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDNOK

                 Coefficient    Std. Error    t-ratio    p-value
  const          14.2993        0.2173         65.8042   <0.0001   ***
  Gold           −0.00358985    0.00017834    −20.1293   <0.0001   ***
  NaturalGas     −0.776106      0.0282488     −27.4739   <0.0001   ***

Mean dependent var   8.245108    S.D. dependent var   0.389453
Sum squared resid    5.383486    S.E. of regression   0.156076
R-squared            0.840835    Adjusted R-squared   0.839394
F(2, 221)            583.7466    P-value(F)           6.37e-89
Log-likelihood       99.72848    Akaike criterion     −193.4570
Schwarz criterion    −183.2220   Hannan-Quinn         −189.3256
rho                  0.835213    Durbin-Watson        0.328522
Model 11: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDNOK

                 Coefficient   Std. Error   t-ratio    p-value
  const          9.92649       0.0757235    131.0886   <0.0001   ***
  NaturalGas     −0.0331292    0.0508692     −0.6513   0.5156
  Brent          −0.0323285    0.00177596   −18.2034   <0.0001   ***

Mean dependent var   8.245108    S.D. dependent var   0.389453
Sum squared resid    6.102984    S.E. of regression   0.166179
R-squared            0.819562    Adjusted R-squared   0.817929
F(2, 221)            501.8999    P-value(F)           6.67e-83
Log-likelihood       85.67901    Akaike criterion     −165.3580
Schwarz criterion    −155.1231   Hannan-Quinn         −161.2267
rho                  0.888607    Durbin-Watson        0.208145
Model 12: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDZAR

                 Coefficient   Std. Error    t-ratio    p-value
  const          21.3444       0.213486       99.9806   <0.0001   ***
  NaturalGas     −0.977925     0.143414       −6.8189   <0.0001   ***
  Brent          −0.108871     0.00500692    −21.7442   <0.0001   ***

Mean dependent var   13.55606    S.D. dependent var   1.533969
Sum squared resid    48.50850    S.E. of regression   0.468503
R-squared            0.907556    Adjusted R-squared   0.906719
F(2, 221)            1084.815    P-value(F)           5.4e-115
Log-likelihood       −146.4926   Akaike criterion     298.9853
Schwarz criterion    309.2202    Hannan-Quinn         303.1166
rho                  0.879994    Durbin-Watson        0.205604
Model 14: OLS, using observations 1960-01-01:1960-08-11 (T = 224)
Dependent variable: USDRUB

               Coefficient    Std. Error     t-ratio    p-value
  const        104.251        0.960429       108.5464   <0.0001   ***
  Brent        −0.564791      0.0220999      −25.5563   <0.0001   ***
  Shanghai     −0.00349858    0.000409167     −8.5505   <0.0001   ***

Mean dependent var   63.87961    S.D. dependent var   7.947528
Sum squared resid    1152.483    S.E. of regression   2.283606
R-squared            0.918179    Adjusted R-squared   0.917438
F(2, 221)            1240.007    P-value(F)           7.4e-121
Log-likelihood       −501.3014   Akaike criterion     1008.603
Schwarz criterion    1018.838    Hannan-Quinn         1012.734
rho                  0.903221    Durbin-Watson        0.178160
Model 2: OLS, using observations 1960-01-01:1960-09-19 (T = 225)
Dependent variable: Apple

             Coefficient   Std. Error     t-ratio    p-value
  const      390.097       20.0778         19.4293   <0.0001   ***
  DJI        0.00533233    0.000430637     12.3824   <0.0001   ***
  USDCNY     −57.7748      2.19356        −26.3384   <0.0001   ***

Mean dependent var   116.1877    S.D. dependent var   10.83334
Sum squared resid    1890.653    S.E. of regression   2.918297
R-squared            0.928082    Adjusted R-squared   0.927434
F(2, 222)            1432.419    P-value(F)           1.3e-127
Log-likelihood       −558.7261   Akaike criterion     1123.452
Schwarz criterion    1133.700    Hannan-Quinn         1127.588
rho                  0.830495    Durbin-Watson        0.333046
Model 1: OLS, using observations 1960-01-01:1961-03-27 (T = 322)
Dependent variable: USDCAD

            Coefficient    Std. Error     t-ratio    p-value
  const     1.4658         0.0120031      122.1180   <0.0001   ***
  Brent     −0.00414812    0.000126136    −32.8862   <0.0001   ***

Mean dependent var   1.085059    S.D. dependent var   0.118781
Sum squared resid    1.034076    S.E. of regression   0.056846
R-squared            0.771673    Adjusted R-squared   0.770960
F(1, 320)            1081.499    P-value(F)           1.2e-104
Log-likelihood       467.4097    Akaike criterion     −930.8195
Schwarz criterion    −923.2704   Hannan-Quinn         −927.8056
rho                  0.969058    Durbin-Watson        0.051482
Model 3: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: ExxonMobil

                Coefficient   Std. Error    t-ratio   p-value
  const         −4.57266      3.76875       −1.2133   0.2263
  Footse100     0.00725082    0.000467742   15.5017   <0.0001   ***
  Gold          0.0339528     0.00306931    11.0621   <0.0001   ***

Mean dependent var   81.49080    S.D. dependent var   4.686620
Sum squared resid    1412.159    S.E. of regression   2.522117
R-squared            0.712977    Adjusted R-squared   0.710392
F(2, 222)            275.7292    P-value(F)           6.74e-61
Log-likelihood       −525.8983   Akaike criterion     1057.797
Schwarz criterion    1068.045    Hannan-Quinn         1061.933
rho                  0.884229    Durbin-Watson        0.232393
Model 4: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: ExxonMobil

            Coefficient   Std. Error    t-ratio    p-value
  const     −46.8102      3.93945       −11.8824   <0.0001   ***
  Gold      0.0405594     0.00226864     17.8782   <0.0001   ***
  DJI       0.00467673    0.00019986     23.4000   <0.0001   ***

Mean dependent var   81.49080    S.D. dependent var   4.686620
Sum squared resid    848.3353    S.E. of regression   1.954822
R-squared            0.827575    Adjusted R-squared   0.826022
F(2, 222)            532.7585    P-value(F)           1.83e-85
Log-likelihood       −468.5684   Akaike criterion     943.1368
Schwarz criterion    953.3851    Hannan-Quinn         947.2731
rho                  0.861211    Durbin-Watson        0.281813
Model 5: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: ExxonMobil

            Coefficient   Std. Error    t-ratio    p-value
  const     −53.5047      5.14368       −10.4020   <0.0001   ***
  SP500     0.0398747     0.00217159     18.3620   <0.0001   ***
  Gold      0.0463417     0.00262328     17.6655   <0.0001   ***

Mean dependent var   81.49080    S.D. dependent var   4.686620
Sum squared resid    1167.540    S.E. of regression   2.293292
R-squared            0.762696    Adjusted R-squared   0.760559
F(2, 222)            356.7554    P-value(F)           4.56e-70
Log-likelihood       −504.4985   Akaike criterion     1014.997
Schwarz criterion    1025.245    Hannan-Quinn         1019.133
rho                  0.893820    Durbin-Watson        0.215242
Model 9: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: PG

             Coefficient   Std. Error   t-ratio   p-value
  const      151.348       8.77858      17.2406   <0.0001   ***
  USDJPY     −0.683958     0.0772603    −8.8526   <0.0001   ***
  Brent      0.197202      0.0245568     8.0304   <0.0001   ***

Mean dependent var   78.98476    S.D. dependent var   4.117686
Sum squared resid    2636.217    S.E. of regression   3.445990
R-squared            0.305893    Adjusted R-squared   0.299639
F(2, 222)            48.91763    P-value(F)           2.50e-18
Log-likelihood       −596.1236   Akaike criterion     1198.247
Schwarz criterion    1208.496    Hannan-Quinn         1202.384
rho                  0.949963    Durbin-Watson        0.097331
Model 10: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: BankAmerica

                 Coefficient   Std. Error     t-ratio    p-value
  const          −19.479       1.29258        −15.0699   <0.0001   ***
  NaturalGas     2.11465       0.123832        17.0768   <0.0001   ***
  NASDAQ100      0.00683298    0.000290473     23.5236   <0.0001   ***

Mean dependent var   15.97807    S.D. dependent var   1.554276
Sum squared resid    103.6922    S.E. of regression   0.683434
R-squared            0.808380    Adjusted R-squared   0.806653
F(2, 222)            468.2703    P-value(F)           2.24e-80
Log-likelihood       −232.1104   Akaike criterion     470.2209
Schwarz criterion    480.4692    Hannan-Quinn         474.3571
rho                  0.908326    Durbin-Watson        0.212804
Model 11: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: Chevron

                Coefficient   Std. Error     t-ratio    p-value
  const         −76.5478      7.28786        −10.5035   <0.0001   ***
  Gold          0.0393618     0.0059353        6.6318   <0.0001   ***
  Footse100     0.0192772     0.000904501     21.3125   <0.0001   ***

Mean dependent var   93.40311    S.D. dependent var   9.797375
Sum squared resid    5280.668    S.E. of regression   4.877170
R-squared            0.754404    Adjusted R-squared   0.752191
F(2, 222)            340.9617    P-value(F)           2.06e-68
Log-likelihood       −674.2783   Akaike criterion     1354.557
Schwarz criterion    1364.805    Hannan-Quinn         1358.693
rho                  0.938045    Durbin-Watson        0.126310
Model 12: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: Chevron

                Coefficient   Std. Error   t-ratio    p-value
  const         119.11        15.9567        7.4646   <0.0001   ***
  NASDAQ100     0.0063668     0.00323398     1.9687   0.0502    *
  USDZAR        −3.97567      0.333502     −11.9210   <0.0001   ***

Mean dependent var   93.40311    S.D. dependent var   9.797375
Sum squared resid    12353.99    S.E. of regression   7.459799
R-squared            0.425434    Adjusted R-squared   0.420258
F(2, 222)            82.18939    P-value(F)           1.94e-27
Log-likelihood       −769.8950   Akaike criterion     1545.790
Schwarz criterion    1556.038    Hannan-Quinn         1549.926
rho                  0.979301    Durbin-Watson        0.042312
Model 16: OLS, using observations 1960-01-01:1960-11-10 (T = 225)
Dependent variable: Coke

                 Coefficient   Std. Error   t-ratio    p-value
  const          43.2875       1.47685       29.3108   <0.0001   ***
  Gold           0.00589016    0.00120946     4.8701   <0.0001   ***
  NaturalGas     −3.42148      0.191359     −17.8799   <0.0001   ***

Mean dependent var   41.70929    S.D. dependent var   1.686764
Sum squared resid    250.2125    S.E. of regression   1.061642
R-squared            0.607398    Adjusted R-squared   0.603861
F(2, 222)            171.7292    P-value(F)           8.49e-46
Log-likelihood       −331.2098   Akaike criterion     668.4196
Schwarz criterion    678.6679    Hannan-Quinn         672.5559
rho                  0.908895    Durbin-Watson        0.211163
Appendix 3. Results of the IBM Watson Analytics Predict function for currencies.
[Chart] Blue: P&G and NKK225; Green: P&G and Gold
[Chart] Blue: Brent and Shanghai; Green: Brent and NKK225
[Chart] Blue: Bank of America and Coke; Green: Bank of America and Gold
[Chart] Blue: Brent and Gold; Green: Gold and Natural Gas
[Chart] Blue: Brent and Shanghai; Green: Brent and CSI300
Appendix 4. Results of the IBM Watson Analytics Predict function for stocks.
[Chart] Blue: Brent and Nasdaq; Green: NKK225 and Brent
[Chart] Blue: Nasdaq and Natural Gas; Green: Shanghai and Natural Gas
[Chart] Blue: FTSE 100 and Gold; Green: DJI and Nasdaq
[Chart] Blue: Gold and Natural Gas; Green: FTSE 100 and Natural Gas
[Chart] Blue: FTSE 100 and Gold; Green: DJI and Gold
[Chart] Blue: Gold and Brent; Green: Brent and NKK225
[Chart] Blue: FTSE 100 and NKK225; Green: FTSE 100 and Nasdaq