Big Data in Finance – Assignment 1
Algorithmic Trading Assignment
Objective:
Develop algorithmic trading strategies and execute trades with them using big data in finance.
Requirements:
The data analysis must be done in Python. The purpose of this assignment is to apply
Big Data Science and artificial intelligence to financial data mining, and to identify
the similarities and differences between your findings and the results reported by
other researchers in journal papers.
Introduction:
Algorithmic trading has become an increasingly important tool in the financial markets,
allowing traders to leverage advanced data analysis and decision-making capabilities
to generate profits. In this assignment, you will develop and evaluate several
algorithmic trading strategies for the Chinese or Hong Kong stock market, the
currency exchange market, or commodity products traded on commodity
exchanges around the world.
Procedures:
1. Financial Instrument Selection and Portfolio Optimization Using Python
1. Stock Selection and Analysis
Requirements:
Use yfinance or Alpha Vantage API to fetch historical stock data.
Select 3–5 stocks based on criteria such as:
Volatility
Moving average crossover
Sector performance
Sample Starter Code:
python
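import yfinance as yf
import matplotlib.pyplot as plt

tickers = ['AAPL', 'MSFT', 'GOOGL']
# auto_adjust=False keeps the 'Adj Close' column; some recent yfinance
# releases adjust prices by default and omit that column
data = yf.download(tickers, start='2023-01-01', end='2023-12-31',
                   auto_adjust=False)['Adj Close']

# Calculate 50-day and 200-day moving averages
# (one year of data leaves only about 50 valid points for the 200-day MA)
ma_50 = data.rolling(window=50).mean()
ma_200 = data.rolling(window=200).mean()

# Plot the price and both moving averages for one ticker
plt.figure(figsize=(12, 6))
plt.plot(data['AAPL'], label='AAPL Price')
plt.plot(ma_50['AAPL'], label='50-day MA')
plt.plot(ma_200['AAPL'], label='200-day MA')
plt.legend()
plt.title('AAPL Stock Analysis')
plt.show()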
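The selection criteria listed above (volatility, moving average crossover) can be turned into numbers before choosing stocks. The following is a minimal sketch, assuming daily closing prices are already in a pandas DataFrame with one column per ticker. The `screen` helper, the synthetic prices, the ticker names and the 252-day annualisation factor are illustrative assumptions, not part of the assignment.

```python
import numpy as np
import pandas as pd

def screen(prices: pd.DataFrame, fast: int = 50, slow: int = 200) -> pd.DataFrame:
    """Rank tickers by annualised volatility and flag the fast/slow MA state."""
    returns = prices.pct_change().dropna()
    # Annualise daily volatility, assuming about 252 trading days per year
    ann_vol = returns.std() * np.sqrt(252)
    # Compare the latest values of the fast and slow moving averages
    fast_ma = prices.rolling(fast).mean().iloc[-1]
    slow_ma = prices.rolling(slow).mean().iloc[-1]
    out = pd.DataFrame({"ann_vol": ann_vol, "fast_above_slow": fast_ma > slow_ma})
    return out.sort_values("ann_vol")

# Synthetic random-walk prices stand in for downloaded data (hypothetical tickers)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2023-01-02", periods=260)
steps = rng.normal(0.0005, [0.01, 0.02, 0.03], size=(260, 3))
prices = pd.DataFrame(100 * np.exp(steps.cumsum(axis=0)),
                      index=dates, columns=["AAA", "BBB", "CCC"])
summary = screen(prices)
print(summary)
```

Replacing `prices` with the `data` frame from the starter code should give the same table for real tickers, provided the history covers at least `slow` trading days.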
2. Exchange Rate Monitoring
Requirements:
Use forex-python or exchangeratesapi.io to track FX rates.
Compare currency pairs (e.g., USD/JPY, EUR/USD) over time.
Identify trends or arbitrage opportunities.
Sample Starter Code:
python
from forex_python.converter import CurrencyRates
import datetime

cr = CurrencyRates()
# Historical rate for a single date: 1 USD expressed in JPY
date = datetime.datetime(2023, 12, 1)
rate = cr.get_rate('USD', 'JPY', date)
print(f"USD to JPY on {date.date()}: {rate}")
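One simple way to look for arbitrage opportunities is a triangular check: compare a quoted cross rate with the rate implied by the two other pairs. The sketch below assumes mid quotes and ignores bid/ask spreads. The function name, the quotes and the 0.1% cost threshold are made-up values for illustration only.

```python
def triangular_gap(usd_jpy: float, eur_usd: float, eur_jpy: float) -> float:
    """Relative gap between the quoted EUR/JPY rate and the rate implied by
    EUR/USD x USD/JPY. A value near zero means no triangular arbitrage."""
    implied_eur_jpy = eur_usd * usd_jpy
    return eur_jpy / implied_eur_jpy - 1.0

# Made-up mid quotes for illustration; real quotes come with bid/ask spreads
gap = triangular_gap(usd_jpy=147.00, eur_usd=1.0900, eur_jpy=160.50)
cost = 0.001  # assumed round-trip transaction cost of 0.1%
print(f"gap = {gap:.4%}, exceeds assumed costs: {abs(gap) > cost}")
```

In practice a gap is only worth reporting if it survives realistic spreads, fees and execution delay, so the cost threshold should be justified in the report.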
3. Commodity Price Tracking
Requirements:
Use InvestPy, Quandl, or other sources to fetch commodity prices
(e.g., gold, oil, wheat).
Visualize price trends and compute simple indicators (e.g., RSI,
MACD).
Sample Starter Code:
python
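import investpy

# Dates use the dd/mm/yyyy format. The commodity name must match an entry
# in investpy's own commodity listing, so verify it before running.
oil_data = investpy.get_commodity_historical_data(
    commodity='crude oil',
    from_date='01/01/2023',
    to_date='31/12/2023',
    interval='Daily'
)
print(oil_data.head())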
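The two indicators named in the requirements, RSI and MACD, can be computed from a closing-price series with pandas alone. This is a sketch under common default settings (14-period RSI with Wilder-style smoothing, 12/26/9 MACD). The function names and the synthetic price series are assumptions, and other smoothing conventions exist.

```python
import numpy as np
import pandas as pd

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index using Wilder-style exponential smoothing."""
    delta = close.diff()
    gain = delta.clip(lower=0)
    loss = (-delta).clip(lower=0)
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """MACD line, signal line and histogram from exponential moving averages."""
    line = close.ewm(span=fast, adjust=False).mean() - close.ewm(span=slow, adjust=False).mean()
    signal_line = line.ewm(span=signal, adjust=False).mean()
    return pd.DataFrame({"macd": line, "signal": signal_line, "hist": line - signal_line})

# Synthetic random-walk prices stand in for a downloaded commodity series
rng = np.random.default_rng(1)
close = pd.Series(80 + rng.normal(0, 1, 300).cumsum())
print(rsi(close).tail(3))
print(macd(close).tail(3))
```

With real data, `close` would be the closing-price column of the downloaded frame.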
4. Options, Futures, and Derivatives
Requirements:
Use yfinance or QuantLib to analyze options or futures contracts.
Calculate Greeks (Delta, Gamma, Theta) or simulate payoff
diagrams.
Sample Starter Code:
python
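import yfinance as yf

ticker = yf.Ticker('AAPL')
# .options lists the available expiry dates as strings
expiry_dates = ticker.options
print("Available Expiry Dates:", expiry_dates)

# Fetch the chain for the nearest expiry; it holds calls and puts tables
opt_data = ticker.option_chain(expiry_dates[0])
calls = opt_data.calls
puts = opt_data.puts
print("Call Options:\n", calls.head())
print("Put Options:\n", puts.head())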
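The Greeks mentioned in the requirements can be approximated with the Black-Scholes formulas. The sketch below covers a European call on a non-dividend-paying asset, which is only an approximation for American-style equity options such as those on AAPL. The function name and the input values are illustrative assumptions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist()  # standard normal distribution

def bs_call_greeks(S: float, K: float, T: float, r: float, sigma: float) -> dict:
    """Delta, Gamma and Theta (per year) of a European call, no dividends.

    S: spot price, K: strike, T: years to expiry,
    r: risk-free rate, sigma: annualised volatility.
    """
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    delta = N.cdf(d1)
    gamma = N.pdf(d1) / (S * sigma * sqrt(T))
    theta = -S * N.pdf(d1) * sigma / (2 * sqrt(T)) - r * K * exp(-r * T) * N.cdf(d2)
    return {"delta": delta, "gamma": gamma, "theta": theta}

# Illustrative inputs: at-the-money call, one year to expiry
greeks = bs_call_greeks(S=100, K=100, T=1.0, r=0.05, sigma=0.2)
print({name: round(value, 4) for name, value in greeks.items()})
```

Dividing `theta` by 365 gives an approximate per-calendar-day time decay, which is how many data providers quote it.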
5. Submit a Python file (.py) with:
i. Clear code comments
ii. Visualizations
iii. Summary of findings
2. Selection of Investment Portfolio for initial capital $1,000,000: (Textbook Ch11)
2.1 Select one business sector in accordance with the Global Industry Classification
Standard in Appendix 1. Each student selects his/her own business sector and
no business sector should be repeated. Design, with explanation, at least 3
investment portfolios in the selected business sector, including altogether 10 items
from the following and 1 corresponding indicator for benchmarking:
1. relevant industrial stocks in China (i.e., Shanghai, Shenzhen or Hong
Kong Stock Markets), for example,
45101010 Internet Software & Services (8-digit number only)
i. 9988.HK - Alibaba Group Holding Ltd.
ii. 0700.HK - Tencent Holdings Ltd.
iii. BIDU - Baidu, Inc.
iv. 9618.HK - JD.com, Inc.
v. PDD - PDD Holdings Inc 拼多多
vi. 600941.SS - China Mobile Ltd 中國移動
vii. …
2. national currencies or cryptocurrencies for the investment portfolio,
i. USD/CNY
ii. EUR/CNY
iii. JPY/CNY
iv. GBP/CNY
v. AUD/CNY
vi. USD/BTC (Bitcoin)
3. commodity products in different commodity exchanges in the world,
i. Gold (XAUUSD)
ii. Silver (XAGUSD)
iii. Crude Oil (USOIL)
iv. Copper (XCUUSD)
v. Wheat (WHEATUSD)
4. or their options, futures and derivatives
2.2 Benchmark the Investment Portfolios against relevant indices, for example:
Stock Indices in China
Hang Seng Index
Shanghai Composite Index
SZSE Component Index
CSI 300 Index
SSE 50 Index
SSE 180 Index
SZSE 100 Index
SZSE 200 Index
2.3 Investment portfolios beyond the required 3 count towards the bonus marks.
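Benchmarking a portfolio against an index comes down to comparing the weighted return of its items with the index return over the same period. The following is a minimal sketch, assuming all prices are already converted to one currency and that weights are fixed over the period. The function name and all prices and index levels are made-up illustration values.

```python
import numpy as np

CAPITAL = 1_000_000  # initial capital stated in the assignment

def portfolio_vs_benchmark(weights, start_prices, end_prices, bench_start, bench_end):
    """Buy-and-hold return of a weighted portfolio compared with an index."""
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to 1")
    asset_returns = np.asarray(end_prices, dtype=float) / np.asarray(start_prices, dtype=float) - 1
    port_return = float(weights @ asset_returns)
    bench_return = bench_end / bench_start - 1
    return {
        "final_value": CAPITAL * (1 + port_return),
        "portfolio_return": port_return,
        "benchmark_return": bench_return,
        "excess_return": port_return - bench_return,
    }

# Ten equally weighted items with made-up start and end prices
weights = [0.1] * 10
start = [100] * 10
end = [110, 95, 105, 100, 120, 90, 102, 98, 101, 99]
result = portfolio_vs_benchmark(weights, start, end, bench_start=17_000, bench_end=17_170)
print(result)
```

Items quoted in different currencies (for example HKD stocks and USD commodities) need a common base currency before their returns can be weighted together.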
3. Trading Strategies
1. Design with explanation the trading strategies as follows:
1. Single Indicator-Based Strategy (refer to the indicators in yahoo finance
advanced chart)
Develop a trading strategy that relies on a single technical indicator, such as the
Shanghai Composite Index's 50-day moving average, the Hang Seng Index's
Relative Strength Index (RSI), or the USD/CNY exchange rate's Stochastic
Oscillator. Explain the rationale behind your chosen indicator and how it can be
used to generate buy and sell signals.
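A single-indicator rule can be expressed in a few lines. The sketch below assumes a long-or-flat rule on a 50-day moving average: hold the asset when the previous close was above its average, otherwise stay in cash. The function name and the synthetic index series are illustrative assumptions, and the one-bar shift is there to avoid using information that was not yet available.

```python
import numpy as np
import pandas as pd

def ma_position(close: pd.Series, window: int = 50) -> pd.Series:
    """1 = long, 0 = flat, based on the previous bar's close versus its MA."""
    ma = close.rolling(window).mean()
    raw = (close > ma).astype(int)
    # Shift by one bar so each day's position uses only earlier data
    return raw.shift(1).fillna(0).astype(int)

# Synthetic index levels stand in for real data
rng = np.random.default_rng(7)
close = pd.Series(3000 + rng.normal(0, 20, 300).cumsum())
position = ma_position(close)
orders = position.diff().fillna(0).astype(int)  # +1 = buy, -1 = sell, 0 = hold
print(orders[orders != 0].head())
```

The `orders` series marks the bars on which the position changes, which is what a backtest would treat as buy and sell signals.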
2. Multiple Indicator-Based Strategy
Create a trading strategy that combines multiple technical indicators to make
trading decisions. For example, you could use the 20-day and 50-day moving
averages of the Shenzhen Component Index, along with the MACD indicator,
to generate trading signals. Discuss how you selected the indicators and how
you integrated them into a cohesive decision-making framework.
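One way to integrate several indicators is to require them to agree. The sketch below assumes an AND rule: go long only when the 20-day average is above the 50-day average and the MACD line is above its signal line. The function name, the 12/26/9 MACD settings and the synthetic data are illustrative assumptions. Voting or weighted schemes are equally valid designs.

```python
import numpy as np
import pandas as pd

def combined_position(close: pd.Series, fast: int = 20, slow: int = 50) -> pd.Series:
    """1 = long only when the trend filter and the MACD filter both agree."""
    trend_up = close.rolling(fast).mean() > close.rolling(slow).mean()
    macd_line = close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
    momentum_up = macd_line > macd_line.ewm(span=9, adjust=False).mean()
    raw = (trend_up & momentum_up).astype(int)
    # Act on the next bar to avoid look-ahead bias
    return raw.shift(1).fillna(0).astype(int)

# Synthetic index levels stand in for real data
rng = np.random.default_rng(3)
close = pd.Series(10000 + rng.normal(0, 50, 300).cumsum())
position = combined_position(close)
print(position.value_counts())
```

Requiring agreement usually produces fewer trades than either indicator alone, which is worth reporting alongside the returns.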
3. Simple Neural Network AI Strategy (Textbook Ch7, Neural Network
with Radical Basic Function.txt)
Implement a simple neural network-based trading strategy using stock data from
the Shanghai Stock Exchange or the Hong Kong Stock Exchange, or currency
exchange rates. Describe the architecture of your neural network, the input
features used (e.g., price, volume, technical indicators), and the training process.
Explain how the neural network generates trading signals.
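As one possible shape for a simple network, the sketch below builds a small radial basis function (RBF) network with NumPy: Gaussian hidden units centred on randomly chosen training samples, and a linear output layer fitted by least squares. It is not the textbook's implementation. The class name, the choice of centres, the lagged-return features and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

class RBFNetwork:
    """Gaussian radial basis function network with a linear output layer."""

    def __init__(self, n_centers=10, gamma=1.0, seed=0):
        self.n_centers, self.gamma, self.seed = n_centers, gamma, seed

    def _features(self, X):
        # Squared distance from every sample to every centre
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        # Gaussian activations plus a constant bias column
        return np.hstack([np.exp(-self.gamma * d2), np.ones((len(X), 1))])

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        # Centres are a random subset of the training samples (a simple choice)
        self.centers = X[rng.choice(len(X), size=self.n_centers, replace=False)]
        # Output weights by ordinary least squares
        self.weights, *_ = np.linalg.lstsq(self._features(X), y, rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.weights

# Synthetic daily returns stand in for real market data
rng = np.random.default_rng(42)
returns = rng.normal(0.0, 0.01, 400)
lags = 5
# Each row holds 5 consecutive returns; the target is the following return
X = np.column_stack([returns[i:len(returns) - lags + i] for i in range(lags)])
y = returns[lags:]

# Chronological split, with scaling statistics taken from the training part only
split = int(0.8 * len(X))
mu, sd = X[:split].mean(axis=0), X[:split].std(axis=0)
X_train, X_test = (X[:split] - mu) / sd, (X[split:] - mu) / sd

model = RBFNetwork(n_centers=10, gamma=0.1).fit(X_train, y[:split])
signals = (model.predict(X_test) > 0).astype(int)  # 1 = long, 0 = flat
print(signals[:10])
```

On random synthetic returns the network has nothing real to learn, so the printed signals only demonstrate the data flow from features to trading signal.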
4. Hybrid Indicator-Based and Neural Network AI Strategy (2.2 + 2.3)
Develop a hybrid trading strategy that combines traditional technical indicators
(such as the 200-day moving average of the CSI 300 Index) with a neural
network-based model. Discuss the rationale for this approach and how the two
components are integrated to make trading decisions.
5. Simple Deep Learning AI Strategy (Textbook Ch15)
Design a deep learning-based trading strategy, such as using a recurrent neural
network (RNN) or a convolutional neural network (CNN) to analyze the
historical price and volume data of Chinese or Hong Kong stocks, or currency
exchange rates. Describe the model architecture, the input data, and the training
process. Explain how the deep learning model is used to generate trading signals.
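Recurrent and convolutional models expect their input as fixed-length sequences, so the price history has to be cut into overlapping windows first. The sketch below shows only that preparation step with NumPy, under the assumption that the target is the next value of the first feature. The helper names are hypothetical, and the model itself would be built in a deep learning framework of your choice.

```python
import numpy as np

def make_windows(series, lookback):
    """Return X with shape (samples, lookback, features) and y, the value of
    the first feature on the step right after each window."""
    data = np.asarray(series, dtype=float)
    if data.ndim == 1:
        data = data[:, None]  # treat a 1-D series as a single feature
    n = len(data) - lookback
    if n <= 0:
        raise ValueError("series is shorter than the lookback window")
    X = np.stack([data[i:i + lookback] for i in range(n)])
    y = data[lookback:, 0]
    return X, y

def chronological_split(X, y, train_frac=0.8):
    """Split without shuffling so the test set lies strictly in the future."""
    cut = int(len(X) * train_frac)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

prices = np.arange(10.0)  # toy series 0, 1, ..., 9
X, y = make_windows(prices, lookback=3)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
print(X.shape, y.shape, len(train_X), len(test_X))
```

Passing a two-column array (for example price and volume) yields windows with two features per time step.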
6. Hybrid Indicator-Based and Deep Learning AI Strategy (2.2+2.5)
Implement a hybrid trading strategy that integrates traditional technical
indicators (e.g., the Bollinger Bands of the Hang Seng Index) with a deep
learning-based model. Explain the benefits of this approach and how the two
components work together to make trading decisions.
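A hybrid rule can use the indicator as a filter on the model's output. The sketch below assumes the model supplies a probability that the price will rise (`prob_up`, a hypothetical array from any trained classifier) and goes long only when that probability is high and the price has not broken above the upper Bollinger Band. The function names, threshold and toy data are illustrative assumptions.

```python
import numpy as np

def bollinger_upper(close, window=20, k=2.0):
    """Upper Bollinger Band: rolling mean plus k rolling standard deviations."""
    close = np.asarray(close, dtype=float)
    upper = np.full(len(close), np.nan)
    for t in range(window - 1, len(close)):
        w = close[t - window + 1 : t + 1]
        upper[t] = w.mean() + k * w.std()
    return upper

def hybrid_signal(close, prob_up, threshold=0.55, window=20, k=2.0):
    """1 = long when the model is confident and price is not above the band."""
    close = np.asarray(close, dtype=float)
    prob_up = np.asarray(prob_up, dtype=float)
    upper = bollinger_upper(close, window, k)
    # Bars without a full window have no band and never pass the filter
    band_ok = close <= np.nan_to_num(upper, nan=-np.inf)
    return (band_ok & (prob_up > threshold)).astype(int)

# Toy prices oscillating around 100 with a final spike to 110
close = np.append(100 + np.tile([1.0, -1.0], 15), 110.0)
prob_up = np.full(len(close), 0.9)  # placeholder for real model output
signal = hybrid_signal(close, prob_up)
print(signal)
```

A signal computed on one bar should be traded on the next bar in a backtest, for the same look-ahead reason as with the pure indicator strategies.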
7. Customized Strategies
Customize at least one trading strategy to find out the optimal trading strategy
in your investment combinations. More than one trading strategy would be
counted in the bonus marks.
4. Backtesting (20241018 backtest using qstock3.py)
For each of the trading strategies developed, perform a comprehensive
backtesting process using at least two years of historical data from the Chinese or
Hong Kong stock market, the currency exchange market or different commodity
exchanges. This should include:
1. Data Preparation: Obtain and preprocess the necessary historical
market data for your trading strategies.
2. Backtesting Methodology: Describe the backtesting methodology you
will use, including the time period, the evaluation metrics (e.g., returns,
drawdown, Sharpe ratio), and any assumptions or constraints.
3. Backtesting Analytical Results: Present the backtesting results for each
trading strategy, including performance metrics, visualizations (e.g.,
equity curves), and a comparative analysis of the strategies. For example,
1. Total return ratio (e.g., 1,100,000 / 1,000,000 = 1.1)
2. Sharpe ratio
3. Drawdown
4. Win/loss ratio
4. Optimization and Sensitivity Analysis (Optional): Discuss any
optimization techniques you used to improve the performance of your
trading strategies, and conduct a sensitivity analysis to understand the
impact of key parameters on the strategy's performance.
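The four example metrics listed above can be computed from an equity curve and a list of per-trade profits. This is a minimal sketch, assuming daily equity values, 252 trading days per year and a zero risk-free rate by default. The function name and the toy numbers are illustrative assumptions.

```python
import numpy as np

def backtest_metrics(equity, trade_pnls, periods_per_year=252, risk_free=0.0):
    """Total return ratio, annualised Sharpe ratio, maximum drawdown and
    win/loss ratio from an equity curve and per-trade profit and loss."""
    equity = np.asarray(equity, dtype=float)
    returns = equity[1:] / equity[:-1] - 1
    excess = returns - risk_free / periods_per_year
    sharpe = np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)
    # Drawdown is the fall from the highest equity seen so far
    running_max = np.maximum.accumulate(equity)
    max_drawdown = ((equity - running_max) / running_max).min()
    wins = sum(p > 0 for p in trade_pnls)
    losses = sum(p < 0 for p in trade_pnls)
    return {
        "total_return_ratio": equity[-1] / equity[0],
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
        "win_loss_ratio": wins / losses if losses else float("inf"),
    }

# Toy equity curve starting from the assignment's 1,000,000 initial capital
metrics = backtest_metrics(
    equity=[1_000_000, 1_050_000, 1_000_000, 1_100_000],
    trade_pnls=[50_000, -50_000, 100_000],
)
print(metrics)
```

Applying the same function to every strategy's equity curve keeps the comparative analysis consistent.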
5. Real-Time Live Simulation
To further evaluate the effectiveness of your trading strategies, implement a
real-time live simulation using current market data from the Chinese or Hong
Kong stock market, or the currency exchange market, over 5 consecutive trading
days. This should involve:
1. Data Feeds (Yahoo Finance): Integrate real-time market data feeds into
your trading system.
2. Order Execution: Develop a mechanism to execute trades based on the
signals generated by your trading strategies.
3. Performance Monitoring and its Analysis: Continuously monitor the
performance of your trading strategies in the live market, tracking key
metrics and risk-adjusted performance. For example,
1. Total return
2. Sharpe ratio
3. Drawdown
4. Win/loss ratio
4. Adaptation and Refinement: Discuss how you would adapt and refine
your trading strategies based on the insights gained from the real-time
live simulation.
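For the live simulation, order execution can be simulated with a small paper-trading ledger instead of a real brokerage connection. The sketch below assumes market orders filled at the quoted price with a proportional fee. The `PaperBroker` class, the fee rate, the lot size and the example prices are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PaperBroker:
    """Tracks cash and one position for simulated (paper) trading."""
    cash: float = 1_000_000.0
    position: int = 0
    fee_rate: float = 0.001  # assumed proportional fee per trade
    log: list = field(default_factory=list)

    def execute(self, signal: str, price: float, size: int = 100) -> bool:
        """Fill a BUY or SELL at the given price; return True if it traded."""
        cost = price * size
        fee = cost * self.fee_rate
        if signal == "BUY" and self.cash >= cost + fee:
            self.cash -= cost + fee
            self.position += size
        elif signal == "SELL" and self.position >= size:
            self.cash += cost - fee
            self.position -= size
        else:
            return False  # HOLD, insufficient cash, or nothing to sell
        self.log.append((signal, price, size))
        return True

    def equity(self, price: float) -> float:
        """Mark-to-market value of cash plus the open position."""
        return self.cash + self.position * price

# Example: strategy signals paired with the latest quoted prices
broker = PaperBroker()
for signal, price in [("BUY", 50.0), ("HOLD", 51.0), ("SELL", 52.0)]:
    broker.execute(signal, price)
print(broker.log, round(broker.equity(52.0), 2))
```

Recording `equity` once per trading day over the 5 days produces the equity curve needed for the performance metrics.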
* Students need to suggest their own business sector. No business sector should be
repeated.
Suggested Sections in the Report:
1. Abstract
2. Introduction and Background
3. Objectives
4. Literature Review (Optional)
5. Investment Portfolio
6. Trading Strategies ***
7. Backtesting and its analysis ***
8. Real-Time Live Simulation and its analysis ***
9. Comparison between Backtesting and the results of Real-Time Live Simulation
10. Discussion (Applications and Implications of Relationship found)
11. Limitations (Any issue related to the Big Data Science / Artificial Intelligence in
this study)
12. Conclusions
13. Recommendations
14. References (the journal and/or conference papers supporting your findings, with
full text as pdf files)
15. Appendices ****
*** These sections correspond to “Research Design and Methodology” and should include
the Big Data Science / Technical Analysis / Artificial Intelligence methods used; Python
should be used for programming.
**** Python code should be attached in the appendices.
Bonus:
Bonus marks can be obtained as follows:
1. Beyond the requirements in Selection of Investment Portfolio (Section 2) in
p.3, one additional investment portfolio is used. (5 marks each, max 5 marks)
2. Beyond the requirements for trading strategies, one additional Artificial
Intelligence, Technical Analysis (TA), Econometrics, Portfolio Analysis, Risk
Analysis or other quantitative analysis method not covered in this
subject is used, with submission of Python code, data and analysis results. However,
the bonus method cannot be the same as in other assignments of Big Data in
Finance. (5 marks each)
Bonus marks are awarded for the items above according to the
quality of the references and data. Maximum bonus marks = 20.
Requirements:
Students are required to present their topic (at least 10 mins per student) and to write
an article in English for English classes / Chinese for Chinese classes.
Submission:
Submit all files online with the following: (I:\Terence\ Big Data in Finance\...):
1. An article (at least 10 pages per 1 student, font 12, single line spacing – count
text, figures, tables only) – English for English classes or English in both
a. Word and
b. md (Obsidian) formats (using Word to md)
i. https://www.wordize.com/word-to-markdown/ or
ii. https://www.zamzar.com/convert/doc-to-md/ (Max 1 MB) or
iii. https://word2md.com/
iv. https://products.aspose.app/words/conversion/word-to-md
v. https://www.vertopal.com/en/convert/doc-to-md, copy the
output to notepad and save as md
2. A presentation file with speaker notes and audio (please add the notes below
the PowerPoint slides) (at least 5 mins per student) – English, PowerPoint 2019
or later (https://support.microsoft.com/en-us/office/record-a-slide-show-with-narration-and-slide-timings-0b9502c6-5f6c-40ae-b1e7-e47d8741161c)
3. Python code in Python Format (py files)
a. 1 master py file for 1. Financial Instrument Selection and Portfolio
Optimization (Section 1 in p.1-2)
b. 1 master py file with all trading strategies, 3 py for 3 investment
portfolios (Section 2-3, 5 in p.3-4, 5)
c. 1 Backtest py file for 1 portfolio backtest, 3 Backtest py files for 3
portfolios backtests, (Section 2-3 in p.4-5)
4. Data Files in Excel / CSV Format (xlsx/CSV) with web address of data source
5. AI prompt for Python code generation (txt file)
6. Neural Network and Deep Learning Model Files
7. Analysis Result Files in Excel Format (xlsx)
8. All References (full text journal paper in pdf files)
9. Fill Online Questionnaire - https://wj.qq.com/s2/16787940/2748/
References:
1. https://www.youtube.com/watch?v=MikiBcP5uQQ&t=3s
2. Web of Science https://www.webofscience.com/wos/woscc/basic-search
3. Scopus https://www.scopus.com/
4. VOSviewer and Scopus https://www.youtube.com/watch?v=QcB9GTHEieY
5. VOSviewer https://www.vosviewer.com/
6. Maxqda https://www.maxqda.com/
7. http://scholar.google.com/
8. http://ec.europa.eu/information_society/activities/egovernment_research/focus
/index_en.htm (eGovernment R&D focus)
9. http://library.ipm.edu.mo/Webpac/eresourcestore.asp?id=100 (ScienceDirect)
10. Other Journals and websites
Date of Submission:
Final Submission: 3 November for the Monday Class and 4 November for the Tuesday Class
Presentations start at the end of this subject (if necessary)
Group:
1 student in 1 group
Appendix 1: Global Industry Classification Standard
10 Energy
1010 Energy
101010 Energy Equipment & Services
10101010 Oil & Gas Drilling
10101020 Oil & Gas Equipment & Services
101020 Oil, Gas & Consumable Fuels
10102010 Integrated Oil & Gas
10102020 Oil & Gas Exploration & Production
10102030 Oil & Gas Refining & Marketing
10102040 Oil & Gas Storage & Transportation
10102050 Coal & Consumable Fuels
15 Materials
1510 Materials
151010 Chemicals
15101010 Commodity Chemicals
15101020 Diversified Chemicals
15101030 Fertilizers & Agricultural Chemicals
15101040 Industrial Gases
15101050 Specialty Chemicals
151020 Construction Materials
15102010 Construction Materials
151030 Containers & Packaging
15103010 Metal & Glass Containers
15103020 Paper Packaging
151040 Metals & Mining
15104010 Aluminum
15104020 Diversified Metals & Mining
15104025 Copper
15104030 Gold
15104040 Precious Metals & Minerals
15104045 Silver
15104050 Steel
151050 Paper & Forest Products
15105010 Forest Products
15105020 Paper Products
20 Industrials
2010 Capital Goods
201010 Aerospace & Defense
20101010 Aerospace & Defense
201020 Building Products
20102010 Building Products
201030 Construction & Engineering
20103010 Construction & Engineering
201040 Electrical Equipment
20104010 Electrical Components & Equipment
20104020 Heavy Electrical Equipment
201050 Industrial Conglomerates
20105010 Industrial Conglomerates
201060 Machinery
20106010 Construction Machinery & Heavy Trucks
20106015 Agricultural & Farm Machinery
20106020 Industrial Machinery
201070 Trading Companies & Distributors
20107010 Trading Companies & Distributors
2020 Commercial & Professional Services
202010 Commercial Services & Supplies
20201010 Commercial Printing
20201050 Environmental & Facilities Services
20201060 Office Services & Supplies
20201070 Diversified Support Services
20201080 Security & Alarm Services
202020 Professional Services
20202010 Human Resource & Employment Services
20202020 Research & Consulting Services
2030 Transportation
203010 Air Freight & Logistics
20301010 Air Freight & Logistics
203020 Airlines
20302010 Airlines
203030 Marine
20303010 Marine
203040 Road & Rail
20304010 Railroads
20304020 Trucking
203050 Transportation Infrastructure
20305010 Airport Services
20305020 Highways & Railtracks
20305030 Marine Ports & Services
25 Consumer Discretionary
2510 Automobiles & Components
251010 Auto Components
25101010 Auto Parts & Equipment
25101020 Tires & Rubber
251020 Automobiles
25102010 Automobile Manufacturers
25102020 Motorcycle Manufacturers
2520 Consumer Durables & Apparel
252010 Household Durables
25201010 Consumer Electronics
25201020 Home Furnishings
25201030 Homebuilding
25201040 Household Appliances
25201050 Housewares & Specialties
252020 Leisure Products
25202010 Leisure Products
252030 Textiles, Apparel & Luxury Goods
25203010 Apparel, Accessories & Luxury Goods
25203020 Footwear
25203030 Textiles
2530 Consumer Services
253010 Hotels, Restaurants & Leisure
25301010 Casinos & Gaming
25301020 Hotels, Resorts & Cruise Lines
25301030 Leisure Facilities
25301040 Restaurants
253020 Diversified Consumer Services
25302010 Education Services
25302020 Specialized Consumer Services
2540 Media
254010 Media
25401010 Advertising
25401020 Broadcasting
25401025 Cable & Satellite
25401030 Movies & Entertainment
25401040 Publishing
2550 Retailing
255010 Distributors
25501010 Distributors
255020 Internet & Direct Marketing Retail
25502020 Internet & Direct Marketing Retail
255030 Multiline Retail
25503010 Department Stores
25503020 General Merchandise Stores
255040 Specialty Retail
25504010 Apparel Retail
25504020 Computer & Electronics Retail
25504030 Home Improvement Retail
25504040 Specialty Stores
25504050 Automotive Retail
25504060 Home furnishing Retail
30 Consumer Staples
3010 Food & Staples Retailing
301010 Food & Staples Retailing
30101010 Drug Retail
30101020 Food Distributors
30101030 Food Retail
30101040 Hypermarkets & Super Centers
3020 Food, Beverage & Tobacco
302010 Beverages
30201010 Brewers
30201020 Distillers & Vintners
30201030 Soft Drinks
302020 Food Products
30202010 Agricultural Products
30202030 Packaged Foods & Meats
302030 Tobacco
30203010 Tobacco
3030 Household & Personal Products
303010 Household Products
30301010 Household Products
303020 Personal Products
30302010 Personal Products
35 Health Care
3510 Health Care Equipment & Services
351010 Health Care Equipment & Supplies
35101010 Health Care Equipment
35101020 Health Care Supplies
351020 Health Care Providers & Services
35102010 Health Care Distributors
35102015 Health Care Services
35102020 Health Care Facilities
35102030 Managed Health Care
351030 Health Care Technology
35103010 Health Care Technology
3520 Pharmaceuticals, Biotechnology & Life Sciences
352010 Biotechnology
35201010 Biotechnology
352020 Pharmaceuticals
35202010 Pharmaceuticals
352030 Life Sciences Tools & Services
35203010 Life Sciences Tools & Services
40 Financials
4010 Banks
401010 Banks
40101010 Diversified Banks
40101015 Regional Banks
401020 Thrifts & Mortgage Finance
40102010 Thrifts & Mortgage Finance
4020 Diversified Financials
402010 Diversified Financial Services
40201020 Other Diversified Financial Services
40201030 Multi-Sector Holdings
40201040 Specialized Finance
402020 Consumer Finance
40202010 Consumer Finance
402030 Capital Markets
40203010 Asset Management & Custody Banks
40203020 Investment Banking & Brokerage
40203030 Diversified Capital Markets
40203040 Financial Exchanges & Data
402040 Mortgage Real Estate Investment Trusts (REITs)
40204010 Mortgage REITs
4030 Insurance
403010 Insurance
40301010 Insurance Brokers
40301020 Life & Health Insurance
40301030 Multi-line Insurance
40301040 Property & Casualty Insurance
40301050 Reinsurance
45 Information Technology
4510 Software & Services
451010 Internet Software & Services
45101010 Internet Software & Services
451020 IT Services
45102010 IT Consulting & Other Services
45102020 Data Processing & Outsourced Services
451030 Software
45103010 Application Software
45103020 Systems Software
45103030 Home Entertainment Software
4520 Technology Hardware & Equipment
452010 Communications Equipment
45201020 Communications Equipment
452020 Technology Hardware, Storage & Peripherals
45202030 Technology Hardware, Storage & Peripherals
452030 Electronic Equipment, Instruments & Components
45203010 Electronic Equipment & Instruments
45203015 Electronic Components
45203020 Electronic Manufacturing Services
45203030 Technology Distributors
4530 Semiconductors & Semiconductor Equipment
453010 Semiconductors & Semiconductor Equipment
45301010 Semiconductor Equipment
45301020 Semiconductors
50 Telecommunication Services
5010 Telecommunication Services
501010 Diversified Telecommunication Services
50101010 Alternative Carriers
50101020 Integrated Telecommunication Services
501020 Wireless Telecommunication Services
50102010 Wireless Telecommunication Services
55 Utilities
5510 Utilities
551010 Electric Utilities
55101010 Electric Utilities
551020 Gas Utilities
55102010 Gas Utilities
551030 Multi-Utilities
55103010 Multi-Utilities
551040 Water Utilities
55104010 Water Utilities
551050 Independent Power and Renewable Electricity Producers
55105010 Independent Power Producers & Energy Traders
55105020 Renewable Electricity
60 Real Estate
6010 Real Estate
601010 Equity Real Estate Investment Trusts (REITs)
60101010 Diversified REITs
60101020 Industrial REITs
60101030 Hotel & Resort REITs
60101040 Office REITs
60101050 Health Care REITs
60101060 Residential REITs
60101070 Retail REITs
60101080 Specialized REITs
601020 Real Estate Management & Development
60102010 Diversified Real Estate Activities
60102020 Real Estate Operating Companies
60102030 Real Estate Development
60102040 Real Estate Services