Welcome to the algo trading learning repo¶

How to use this¶
What this is¶
- a short introduction to algo trading terminology, basic theory and data science
- explanation and equation of common technical indicators
- code implementation of technical & fundamental analysis strategies in Python
- code examples of using machine learning for stock screening and trend analysis
Why we built this¶
- There are lots of blog posts and online materials about algo trading, but they are scattered and usually focus on only a few concepts or topics.
- There are open-source repositories that implement a wide selection of algo trading strategies, but they do not come with explanations and assume that the user already knows the financial logic behind them.
- On the other hand, there are good resources (e.g. Investopedia) that provide detailed discussions of financial concepts and the rationale of trading strategies, but they are not accompanied by any code examples.
- The majority of resources do not fit the local context (e.g. trade execution in Hong Kong) and focus mainly on the US market.
Installation¶
Introduction¶
Welcome to the first tutorial for algorithmic trading!
In this tutorial, you will learn:
- The basics of algorithmic trading
- Definition of technical analysis
- Definition of fundamental analysis
- Difference between technical analysis and fundamental analysis
- How machine learning could assist algo trading
What is algo trading?¶
Definition
A typical algorithmic trading system does the following things (in order):
- Inspect data, charts, quotes or news and generate trade signals as per your strategy
- Fill in order details when a trade signal is found
- Monitor and evaluate trades to see if they reached your target or went in the opposite direction
- Close positions to either book profits or cut losses
- Rinse and repeat
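As a toy illustration of that loop (purely hypothetical prices and rules, not a real strategy), the whole cycle could be sketched in a few lines of Python:
# A self-contained toy version of the signal -> order -> monitor -> close loop
prices = [100, 101, 99, 103, 102, 106, 101]   # pretend incoming quotes
position = None
for i in range(1, len(prices)):
    # 1. inspect the data and generate a trade signal (naive momentum rule)
    signal = "buy" if prices[i] > prices[i - 1] else None
    # 2. fill in order details when a signal is found
    if signal == "buy" and position is None:
        position = {"entry": prices[i], "target": prices[i] * 1.03, "stop": prices[i] * 0.98}
    # 3. monitor the trade and 4. close it to book profit or cut loss
    if position and (prices[i] >= position["target"] or prices[i] <= position["stop"]):
        print("close at", prices[i], "entry was", position["entry"])
        position = None
# 5. rinse and repeat on the next batch of data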
Automated Trading¶
High-frequency Trading¶
Tip
- Algorithmic Trading: execution process based on an algorithm
- Automated Trading: as its name implies, automate the trading process
- High-frequency Trading: ultra-fast automated trading
What is technical analysis?¶
Definition
Rationale of technical analysis¶
1. Market Action Discounts Everything
2. Prices Move in Trends
3. History repeats itself
Pros and cons of technical analysis¶
Pros:
- Objectiveness
- Mathematical precision
- Emotional indifference
- Inexpensive
Cons:
- Self-fulfilling prophecy
- The future keeps running away
- Conflicting signals from different indicators
- Substantial movements might have taken place when a pattern is identified
What is fundamental analysis?¶
Definition
Rationale of fundamental analysis¶
- Economic analysis - focuses on analysing various macroeconomic factors such as interest rates, inflation, and GDP levels
- Industry analysis - focuses on assessing specific prospects and potential opportunities within the identified industries and sectors
- Company analysis - focuses on analysing and selecting individual stocks within the most promising industries
Pros and cons of fundamental analysis¶
Pros:
- Seeks to understand the value of an asset
- Long-term view
- Comprehensive
Cons:
- Time-consuming
- Results not suitable for quick decisions
- Does not provide info about entry points
Technical analysis vs Fundamental analysis¶
Important

The use of machine learning¶
(To-be edited)
Macroeconomic data¶
(To-be edited)
Sentiment analysis¶
(To-be edited)
Conclusion¶
References
- Murphy, J. J. (1991). Technical analysis of the futures markets: A comprehensive guide to trading methods and applications. New York: New York Institute of Finance.
- CFI - Technical Analysis: A Beginner’s Guide
- IG - Technical Analysis definition
- FBS - Pros and Cons of Technical Analysis
- CFI - What is Fundamental Analysis?
Attention
Data science basics¶
In this tutorial, you will learn to:
- Conduct Exploratory Data Analysis (EDA)
- Carry out resampling
- Visualise time series data
- Calculate and plot distribution of percentage change
- Use moving windows
Requirements:
Exploratory Data Analysis¶
Definition
For example, in the process of EDA, we aim to find the answers to these questions:
- What kind of data does the dataframe store?
- What is the range of each column?
- What is the data type of each column?
- Are there null values in the data?
# import AAPL csv file
aapl = pd.read_csv('../../database/nasdaq_ticks_day/nasdaq_AAPL.csv', header=0, index_col='Date', parse_dates=True)
# inspect first 5 rows (default)
aapl.head()
# inspect first 3 rows
aapl.head(3)
# check name of columns
aapl.columns
# generate descriptive statistics, e.g. central tendency and dispersion of the dataset
# (excl. NaN values)
aapl.describe()
# prints a summary of the dataframe e.g. dtype, non-null values
aapl.info()
# print rows between two specific dates
print(aapl.loc[pd.Timestamp('2020-07-01'):pd.Timestamp('2020-07-17')])
# select only data between 2006 and 2019
aapl = aapl.loc[pd.Timestamp('2006-01-01'):pd.Timestamp('2019-12-31')]
aapl.head()
Resampling¶
It is easy to mix up sampling and resampling, which indeed refer to two different concepts. We will first take a look at the definition of the former:
Definition
Assume that each row of data represents an observation about something in the world. When working with data, we usually do not have access to all possible observations since they might be hard to gather, or it might be too costly to process them altogether. Thus, we use sampling as a solution - to select some part of the population to observe, so that we can infer something about the whole population.
For example, we can conduct Simple Random Sampling, which means that each row (or observation) is drawn with a uniform probability from the dataset.
# Take 10 rows from the dataframe randomly
sample = aapl.sample(10)
print(sample)
Now that we understand what sampling is, let’s go back to resampling.
Definition
As a data sample might not accurately represent the population, it introduces the problems of Selection Bias and Sampling Error.
- Selection Bias occurs when the method of drawing observations skews the sample in some way.
- Sampling Error occurs when randomly drawing observations skews the sample in some way.
To address this problem, we want to know how accurately the data sample estimates the population parameter (e.g. the mean, or the standard deviation).
If we only sample the data once, we will only have one single estimate of the population parameter, which makes it impossible to quantify the uncertainty of the estimate (assuming we do not have the population data). Therefore, we could try to estimate the population parameter multiple times from our data sample - we call this action resampling.
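For instance, a bootstrap-style resampling (repeatedly redrawing, with replacement, from our sample and recomputing the statistic) could look like this; the snippet below is only a sketch and reuses the aapl dataframe loaded earlier:
# Estimate the mean closing price many times by resampling with replacement
estimates = []
for _ in range(100):
    bootstrap_sample = aapl['Close'].sample(n=len(aapl), replace=True)
    estimates.append(bootstrap_sample.mean())
# The spread of these estimates reflects the uncertainty of the estimate
print(pd.Series(estimates).describe())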
With regards to the resample function in pandas, it is used for changing the time interval of a dataset. Thus, we need a datetime type index or column in order to use the function.
# Resample to monthly level
monthly_aapl = aapl.resample('M').mean()
print(monthly_aapl)
As shown in the code above, there are two steps in calling the function:
- Pass the ‘Rule’ argument to the function, which determines the interval by which the data will be resampled. In the example above, ‘M’ means month-end frequency.
- Decide how to reduce the old datapoints or fill in the new ones, by calling groupby aggregate functions including mean(), min(), max(), sum().
In the above example, as we are resampling the data to a wider time frame (from days to months), we are actually “downsampling” the data.
On the other hand, if we resample the data to a shorter time frame (from days to minutes), it will be called “upsampling”:
# Resample to minutely level
minutely_aapl = aapl.resample('T').ffill()
print(minutely_aapl)
As we end up having additional empty rows in the resulting table, we need to decide how to fill them in with numeric values:
- ffill() or pad() ('forward filling' / 'padding'): use the last known value.
- bfill() or backfill() ('backfilling'): use the next known value.
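To see the difference between the fill methods, we could compare them on the same upsampled index (a small sketch reusing the aapl dataframe from above):
# Forward filling repeats the last known close; backfilling uses the next known close
filled_forward = aapl['Close'].resample('T').ffill()
filled_backward = aapl['Close'].resample('T').bfill()
print(filled_forward.head())
print(filled_backward.head())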
Calculate percentage change¶
We can directly use the pct_change() function to do this.
daily_close = aapl[['Close']]
# Calculate daily returns
daily_pct_change = daily_close.pct_change()
# Replace NA values with 0
daily_pct_change.fillna(0, inplace=True)
# Inspect daily returns
print(daily_pct_change.head())
# Calculate daily log returns
daily_log_returns = np.log(daily_close.pct_change()+1)
# Print daily log returns
print(daily_log_returns.head())
We can also combine this with resampling to get the percentage change over different time intervals.
# Resample to business months, take last observation as value
monthly = aapl.resample('BM').apply(lambda x: x[-1])
# Calculate monthly percentage change
monthly.pct_change().tail()
This example takes the mean instead of the last observation in each bin as the value.
# Resample to quarters, take the mean as value per quarter
quarter = aapl.resample("4M").mean()
# Calculate quarterly percentage change
quarter.pct_change().tail()
It is also good to learn how to do the calculation manually, without using the pct_change() function.
# Daily returns
daily_pct_change = daily_close / daily_close.shift(1) - 1
# Print `daily_pct_change`
daily_pct_change.tail()
Visualise time series data¶
We will mainly use plotting functions provided by matplotlib. Line plot is the most common type of plot that we will use for analysis of stock data.
# Plot the closing prices for `aapl`
aapl['Close'].plot(grid=True)
# Show the line plot
plt.show()
Here is an example of plotting a histogram:
# Plot the distribution of `daily_pct_c`
daily_pct_change.hist(bins=50)
# Show the plot
plt.show()
# Pull up summary statistics
print(daily_pct_change.describe())
We can also create a new column to store the cumulative daily returns and plot the data in a graph.
# Calculate the cumulative daily returns
cum_daily_return = (1 + daily_pct_change).cumprod()
# Plot the cumulative daily returns
cum_daily_return.plot(figsize=(12,8))
# Show the plot
plt.show()
Moving windows¶
Moving windows (also called “rolling windows”) are snapshots of a portion of a time series at an instant in time. It is common to use the moving window in a trading strategy, for example to calculate a moving average.
# Isolate the closing prices
close_px = aapl['Close']
# Calculate the moving average
moving_avg = close_px.rolling(window=40).mean()
# Inspect the result
moving_avg.tail()
We can now easily plot the short-term and long-term moving averages:
# Short moving window rolling mean
aapl['42'] = close_px.rolling(window=42).mean()
# Long moving window rolling mean
aapl['252'] = close_px.rolling(window=252).mean()
# Plot the adjusted closing price, the short and long windows of rolling means
aapl[['Close', '42', '252']].plot()
# Show plot
plt.show()
Summary¶
- Exploratory Data Analysis (EDA)
  - head() and tail() - check first or last rows
  - describe() - mean, sd, range
  - info() - dtype, non-null counts
- Resampling
  - df.resample('M').mean() - downsample to months (small to big, reduce values)
  - df.resample('T').ffill() - upsample to minutes (big to small, add values)
- Percentage change
  - col.pct_change()
- Visualise data
  - col.plot() - line plot
  - col.hist(bins=50) - histogram
- Moving window
  - close_px.rolling(window=40).mean() - moving average
References
- Towards Data Science - What is Exploratory Data Analysis?
- Jason Brownlee - A Gentle Introduction to Statistical Sampling and Resampling
- Towards Data Science - Using the Pandas “Resample” Function
- Algorithmic trading explained
- DataCamp - Python for Finance: Algorithmic Trading
Attention
Technical analysis¶
In this tutorial, you will learn:
- The basics of technical analysis
- Technical analysis charts
- What are the common technical indicators
- How to implement technical indicators
Intro to technical analysis¶
In general, technicians consider the following types of indicators:
- Price trends
- Chart analysis
- Volume indicators
- Momentum indicators
- Oscillators
- Moving averages
Requirements:
Chart analysis¶
Definition
Line chart¶
plt.style.use('ggplot')
# Initialise the plot figure
fig = plt.figure()
fig.set_size_inches(18.5, 10.5)
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)
df['50ma'] = df['Close'].rolling(window=50, min_periods=0).mean()
df.dropna(inplace=True)
ax1.plot(df.index, df['Close'])
ax1.plot(df.index, df['50ma'])
ax2.bar(df.index, df['Volume'])
plt.show()

Example of a line chart and a bar chart showing price and volume changes respectively.
Candlesticks chart¶

Explanation of candlestick components. [1]
We could use mpl_finance to plot candlestick charts:
fig = plt.figure()
fig.set_size_inches(18.5, 10.5)
ax1 = plt.subplot2grid((6,1), (0,0), rowspan=5, colspan=1)
ax2 = plt.subplot2grid((6,1), (5,0), rowspan=1, colspan=1, sharex=ax1)
# plot candlesticks
mpl_finance.candlestick_ohlc(ax1, data, width=0.7, colorup='g', colordown='r')
ax1.grid() # show grids
############# x-axis locater settings #################
locator = mdates.AutoDateLocator() # interval set automatically
ax1.xaxis.set_major_locator(locator) # set as major locator on the x-axis
ax1.xaxis.set_minor_locator(mdates.DayLocator())
############# x-axis locater settings #################
ax1.xaxis.set_major_formatter(mdates.AutoDateFormatter(locator)) # set x-axis label as date format
fig.autofmt_xdate() # rotate date labels on x-axis
pos = df['Open'] - df['Close'] < 0
neg = df['Open'] - df['Close'] > 0
ax2.bar(df.index[pos],df['Volume'][pos],color='green',width=1,align='center')
ax2.bar(df.index[neg],df['Volume'][neg],color='red',width=1,align='center')
plt.show()

Example of a candlestick chart.
Scaling¶
Arithmetic scaling¶
Key points
- On a linear scale, as the distance in the axis increases the corresponding value also increases linearly.
- When the values of data fluctuate between extremely small values and very large values, the linear scale will obscure the smaller values, thus conveying a misleading picture of the underlying phenomenon.
Semi-logarithmic scaling¶
Key points
- On a logarithmic scale, as the distance in the axis increases the corresponding value increases exponentially.
- With logarithmic scale, both smaller valued data and bigger valued data can be captured in the plot more accurately to provide a holistic view.
Therefore, semi-logarithmic charts can be of immense help especially when plotting long-term charts, or when the price points show significant volatility even in short-term charts. The underlying chart patterns will be revealed more clearly in semi-logarithmic scale charts.
plt.style.use('ggplot')
fig, (ax1, ax2) = plt.subplots(1, 2)
fig.set_size_inches(18.5, 7.0)
### Subplot 1 - Semi-logarithmic ###
plt.subplot(121)
plt.grid(True, which="both")
# Linear X axis, Logarithmic Y axis
plt.semilogy(df.index, df['Close'], 'r')
plt.ylim([10,500])
plt.xlabel("Date")
plt.title('Semi-logarithmic scale')
fig.autofmt_xdate()
### Subplot 2 - Arithmetic ###
plt.subplot(122)
plt.plot(df.index, df['Close'], 'b')
plt.xlabel("Date")
plt.title('Arithmetic scale')
fig.autofmt_xdate()
# show plot
plt.show()

The same data plotted with semi-logarithmic and arithmetic scales.
Technical indicators¶
The code for the technical indicators could be found in code/technical-analysis_python/ in the repository.
In general, there are 2 categories of indicators:
- Leading - they give trade signals when a trend is about to start, hence they use shorter periods in their calculations. Examples are MACD and RSI.
- Lagging - they follow the price action, and thus give a signal after a trend or a reversal has started. Examples are Moving Averages and Bollinger Bands.
Definition
# in terminal
cd code/technical-analysis_python
python main_macd_crossover.py # run macd in the backtester
Trend indicators¶
Definition
Moving Average Convergence Divergence (MACD)¶
One of the simplest strategies established with MACD is to identify MACD crossovers. The rules are as follows.
Tip
- Buy signal: MACD rises above the signal line
- Sell signal: MACD falls below the signal line
It is easy to calculate the EMA with pandas:
# Get close column
close = df['Close']
exp1 = close.ewm(span=12, adjust=False).mean()
exp2 = close.ewm(span=26, adjust=False).mean()
df['MACD'] = exp1 - exp2
Here, span specifies the time span, and adjust=False means the exponentially weighted function is calculated recursively (as we do not need a decaying adjustment factor for beginning periods). To plot the signal line:
df['Signal line'] = df['MACD'].ewm(span=9, adjust=False).mean()
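Based on the tip above, the crossover signals could then be generated like this (a sketch, assuming pandas as pd and numpy as np are imported and df already holds the MACD and Signal line columns):
# 1.0 while MACD is above the signal line, 0.0 otherwise
signals = pd.DataFrame(index=df.index)
signals['signal'] = np.where(df['MACD'] > df['Signal line'], 1.0, 0.0)
# +1 marks a buy (upward crossover), -1 marks a sell (downward crossover)
signals['positions'] = signals['signal'].diff()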
Moving Averages (MA)¶
We could establish a simple trading strategy making use of two moving averages:
Tip
- Buy signal: shorter-term MA crosses above the longer-term MA (golden cross)
- Sell signal: shorter-term MA crosses below the longer-term MA (dead/death cross)
Here is an example of how to compute the two MAs:
# Create short simple moving average over the short window
signals['short_mavg'] = self.df['Close'].rolling(window=short_window, min_periods=1, center=False).mean()
# Create long simple moving average over the long window
signals['long_mavg'] = self.df['Close'].rolling(window=long_window, min_periods=1, center=False).mean()
We could set short_window and long_window on our own, for example setting short_window = 40 and long_window = 100. And then we could generate signals based on the two moving averages:
# Generate signals
signals['signal'][short_window:] = np.where(signals['short_mavg'][short_window:]
    > signals['long_mavg'][short_window:], 1.0, 0.0)
signals['positions'] = signals['signal'].diff()
Parabolic Stop and Reverse (Parabolic SAR)¶
(i) Rising SAR (Uptrend)
- Prior SAR: the SAR value of the previous period.
- Extreme Point (EP): The highest high of the current uptrend.
- Acceleration Factor (AF): Starting at 0.02, increases by 0.02 each time the extreme point makes a new high. AF can only reach a maximum of 0.2, no matter how long the uptrend extends.
(ii) Falling SAR (Downtrend)
- Prior SAR: the SAR value of the previous period.
- Extreme Point (EP): The lowest low of the current downtrend.
- Acceleration Factor (AF): Starting at 0.02, increases by 0.02 each time the extreme point makes a new low. AF can only reach a maximum of 0.2, no matter how long the downtrend extends.
We generate signals based on the rising and falling SARs.
Tip
- Buy signal: if falling SAR goes below the price
- Sell signal: if rising SAR goes above the price
array_high = list(df['High'])
array_low = list(df['Low'])
array_close = list(df['Close'])
psar = df['Close'].copy()
psarbull = [None] * len(df)
psarbear = [None] * len(df)
bull = True # flag to indicate saving value for rising SAR
initial_af = 0.02 # starting acceleration factor
af = initial_af # initialise acceleration factor
max_af = 0.2
ep = array_low[0] # extreme price
hp = array_high[0] # extreme high
lp = array_low[0] # extreme low
Then, traversing each row in the dataframe, we could calculate rising SAR and falling SAR at the same time:
for i in range(2, len(df)):
    if bull:
        # Rising SAR
        psar[i] = psar[i-1] + af * (hp - psar[i-1])
    else:
        # Falling SAR
        psar[i] = psar[i-1] + af * (lp - psar[i-1])
    reverse = False
    # Check reversion point
    if bull:
        if array_low[i] < psar[i]:
            bull = False
            reverse = True
            psar[i] = hp
            lp = array_low[i]
            af = initial_af
    else:
        if array_high[i] > psar[i]:
            bull = True
            reverse = True
            psar[i] = lp
            hp = array_high[i]
            af = initial_af
    if not reverse:
        if bull:
            # Extreme high makes a new high
            if array_high[i] > hp:
                hp = array_high[i]
                af = min(af + initial_af, max_af)
            # Check if SAR goes above prior two periods' lows.
            # If so, use the lowest of the two for SAR.
            if array_low[i-1] < psar[i]:
                psar[i] = array_low[i-1]
            if array_low[i-2] < psar[i]:
                psar[i] = array_low[i-2]
        else:
            # Extreme low makes a new low
            if array_low[i] < lp:
                lp = array_low[i]
                af = min(af + initial_af, max_af)
            # Check if SAR goes below prior two periods' highs.
            # If so, use the highest of the two for SAR.
            if array_high[i-1] > psar[i]:
                psar[i] = array_high[i-1]
            if array_high[i-2] > psar[i]:
                psar[i] = array_high[i-2]
    # Save rising SAR
    if bull:
        psarbull[i] = psar[i]
    # Save falling SAR
    else:
        psarbear[i] = psar[i]
Momentum indicators¶
Commodity Channel Index (CCI)¶
The formula for calculating CCI is given as follows:
CCI = (Typical Price - SMA of Typical Price) / (Constant * Mean Deviation), where
- Typical Price (TP) = (High + Low + Close) / 3
- Constant = 0.015
- x = Window size (default set as 20)
- SMA: Simple Moving Average
window_size = 20
constant = 0.015
signals['Typical price'] = (df['High'] + df['Low'] + df['Close']) / 3
signals['SMA'] = signals['Typical price'].rolling(window=window_size, min_periods=1, center=False).mean()
signals['mean_deviation'] = signals['Typical price'].rolling(window=window_size, min_periods=1, center=False).std()
signals['CCI'] = (signals['Typical price'] - signals['SMA']) / (constant * signals['mean_deviation'])
A simple strategy formulated using CCI is as follows (the thresholds only serve as examples):
Tip
- Buy signal: when CCI surges above +100
- Sell signal: when CCI plunges below -100
# Generate buy signal
signals.loc[signals['CCI'] > 100, 'signal'] = 1.0
# Generate sell signal
signals.loc[signals['CCI'] < -100, 'signal'] = -1.0
Relative Strength Index (RSI)¶
where Average Gain and Average Loss are calculated as follows:
In the dataset, we need to extract gains and losses from the price column respectively:
# Get adjusted close column
close = df['Close']
# Get the difference in price from previous step
delta = close.diff()
# Get rid of the first row
delta = delta[1:]
# Make the positive gains (up) and negative gains (down) series
up, down = delta.copy(), delta.copy()
up[up < 0] = 0
down[down > 0] = 0
To calculate RS, as well as RSI:
# Calculate SMA using 'rolling' function
window_size = 14 # a common RSI look-back period
roll_up = up.rolling(window_size).mean()
roll_down = down.abs().rolling(window_size).mean()
# Calculate RSI based on SMA
RS = roll_up / roll_down
RSI = 100.0 - (100.0 / (1.0 + RS))
Tip
- Oversold: when RSI crosses the lower threshold (e.g. 30)
- Overbought: when RSI crosses the upper threshold (e.g. 70)
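Following the tip above, the conditions could be flagged like this (a sketch; RSI is the series computed above and the 30/70 thresholds are just the usual defaults):
signals = pd.DataFrame(index=RSI.index)
signals['signal'] = 0.0
signals.loc[RSI < 30, 'signal'] = 1.0    # oversold, potential buy
signals.loc[RSI > 70, 'signal'] = -1.0   # overbought, potential sell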
Rate of Change (ROC)¶
As you could see from above, it’s just the simple percentage change formula.
We could identify overbought and oversold conditions using ROC:
Tip
- Oversold: when ROC crosses the lower threshold (e.g. -30)
- Overbought: when ROC crosses the upper threshold (e.g. +30)
And here is one of the possible ways to calculate ROC:
n = 12 # set time period
diff = df['Close'].diff(n - 1)
# Closing price n periods ago
closing = df['Close'].shift(n - 1)
df['ROC'] = (diff / closing) * 100
Stochastic Oscillator (STC)¶
- Lowest Low = lowest low for the look-back period
- Highest High = highest high for the look-back period
Note that in the formula %K is multiplied by 100 so as to move the decimal point by two places.
k = 14 # look-back period
array_highest = [0] * length # store highest highs
for i in range(k - 1, length):
    highest = array_high[i]
    for j in range(i - (k - 1), i + 1): # k-day look-back period
        if array_high[j] > highest:
            highest = array_high[j]
    array_highest[i] = highest
array_lowest = [0] * length # store lowest lows
for i in range(k - 1, length):
    lowest = array_low[i]
    for j in range(i - (k - 1), i + 1): # k-day look-back period
        if array_low[j] < lowest:
            lowest = array_low[j]
    array_lowest[i] = lowest
# find %K line values
kvalues = [0] * length
for i in range(k - 1, length):
    kvalue = ((array_close[i] - array_lowest[i]) * 100) / (array_highest[i] - array_lowest[i])
    kvalues[i] = kvalue
df['%K'] = kvalues
# find %D line values
df['%D'] = df['%K'].rolling(window=3, min_periods=1, center=False).mean()
Tip
- Buy signal: when %K line crosses above the %D line
- Sell signal: when %K line crosses below the %D line
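As with the MACD crossover, these rules could be turned into signals with a few lines (a sketch, assuming df holds the %K and %D columns computed above):
signals = pd.DataFrame(index=df.index)
signals['signal'] = np.where(df['%K'] > df['%D'], 1.0, 0.0)   # 1.0 while %K is above %D
signals['positions'] = signals['signal'].diff()               # +1 = buy crossover, -1 = sell crossover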
True Strength Index (TSI)¶
(i) Double Smoothed Price Change (PC)
- PC = Current Price - Prior Price
- First Smoothing = 25-period EMA of PC
- Second Smoothing = 13-period EMA of 25-period EMA of PC
(ii) Double Smoothed Absolute Price Change (|PC|)
- Absolute Price Change | PC | = Absolute Value of Current Price minus Prior Price
- First Smoothing = 25-period EMA of | PC |
- Second Smoothing = 13-period EMA of 25-period EMA of | PC |
Based on the above formulae, the code is shown as follows:
pc = df['Close'] - df['Close'].shift(1) # price change
df['Double Smoothed PC'] = pc.ewm(span=25, adjust=False).mean().ewm(
    span=13, adjust=False).mean()
df['Double Smoothed Abs PC'] = abs(pc).ewm(span=25, adjust=False).mean().ewm(
span=13, adjust=False).mean()
df['TSI'] = df['Double Smoothed PC'] / df['Double Smoothed Abs PC'] * 100
In order to interpret the TSI, we could define a signal line:
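A common choice (an assumption here, as the period is not specified above) is an EMA of the TSI itself, for example over 12 periods:
# Signal line: EMA of the TSI (the 12-period span is only illustrative)
df['TSI signal line'] = df['TSI'].ewm(span=12, adjust=False).mean()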
And we could observe signal line crossovers:
Tip
- Buy signal: when TSI crosses above the signal line from below
- Sell signal: when TSI crosses below the signal line from above
Money Flow Index (MFI)¶
It is pretty straightforward to calculate typical price:
# Typical price
tp = (df['High'] + df['Low'] + df['Close']) / 3.0
# positive = 1, negative = -1
df['Sign'] = np.where(tp > tp.shift(1), 1, np.where(tp < tp.shift(1), -1, 0))
# Raw money flow
df['Money flow'] = tp * df['Volume'] * df['Sign']
n = 14 # period for the money flows (14 is a common choice)
# Positive money flow over n periods
n_positive_mf = df['Money flow'].rolling(n).apply(lambda x: np.sum(np.where(x >= 0.0, x, 0.0)), raw=True)
# Negative money flow over n periods
n_negative_mf = abs(df['Money flow'].rolling(n).apply(lambda x: np.sum(np.where(x < 0.0, x, 0.0)), raw=True))
With the money flows, it would be easy to compute the MFI:
mf_ratio = n_positive_mf / n_negative_mf
df['MFI'] = (100 - (100 / (1 + mf_ratio)))
By way of example, we could use MFI to identify overbought and oversold conditions:
Tip
- Oversold: when MFI crosses below the lower threshold (e.g. 20)
- Overbought: when MFI crosses above the upper threshold (e.g. 80)
Williams %R¶
- Lowest Low = lowest low for the look-back period
- Highest High = highest high for the look-back period
The code for implementing %R is shown as follows:
lbp = 14 # set lookback period
hh = df['High'].rolling(lbp).max() # highest high over lookback period
ll = df['Low'].rolling(lbp).min() # lowest low over lookback period
df['%R'] = -100 * (hh - df['Close']) / (hh - ll)
Similarly, we could use %R to identify overbought and oversold conditions:
Tip
- Oversold: when %R goes below -80
- Overbought: when %R goes above -20
Volatility indicators¶
Bollinger Bands¶
window = 20
# Compute middle band
df['Middle band'] = df['Close'].rolling(window).mean()
# Compute 20-day s.d.
mstd = df['Close'].rolling(window).std(ddof=0)
# Compute upper and lower bands
df['Upper band'] = df['Middle band'] + mstd * 2
df['Lower band'] = df['Middle band'] - mstd * 2
Tip
- Buy signal: when price goes below lower band
- Sell signal: when price goes above upper band
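A sketch of the corresponding signal generation, using the band columns computed above:
signals = pd.DataFrame(index=df.index)
signals['signal'] = 0.0
signals.loc[df['Close'] < df['Lower band'], 'signal'] = 1.0    # price below lower band, buy
signals.loc[df['Close'] > df['Upper band'], 'signal'] = -1.0   # price above upper band, sell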
Average True Range¶
- 1st True Range (TR) value = High - Low
- 1st n-day ATR = average of the daily TR values for the last n days
array_high = list(df['High'])
array_low = list(df['Low'])
window = 14
tr = [None] * len(df) # initialisation
for i in range(len(df)):
    tr[i] = array_high[i] - array_low[i]
atr = [None] * len(df) # initialisation
# first ATR value = simple average of the first `window` TR values
atr[window] = sum(tr[0:window]) / window
for i in range(window + 1, len(df)):
    atr[i] = (atr[i-1] * (window - 1) + tr[i]) / window
Tip
We could use ATR to filter out stocks that are highly volatile.
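For example, the ATR could be expressed relative to the closing price and compared against a cut-off (a sketch; the 5% threshold is purely illustrative):
# Relative ATR: average true range as a fraction of the closing price
df['ATR'] = atr
relative_atr = df['ATR'] / df['Close']
# keep the stock only if its latest relative ATR stays below 5%
print("Volatility acceptable:", relative_atr.iloc[-1] < 0.05)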
Standard Deviation¶
As an example, we could set window=21:
window = 21
df['SD'] = df['Close'].rolling(window).std(ddof=0)
Tip
We could use Standard Deviation to measure the expected risk of stocks.
Volume indicators¶
Chaikin Oscillator¶
# Money Flow Multiplier
df['MFM'] = ((df['Close'] - df['Low']) - (df['High'] - df['Close'])) / (df['High'] - df['Low'])
# Money Flow Volume
df['MFV'] = df['MFM'] * df['Volume']
# Accumulation/Distribution Line (running total of Money Flow Volume)
df['ADL'] = df['MFV'].cumsum()
short_w = 3
long_w = 10
ema_short = df['ADL'].ewm(ignore_na=False, min_periods=0, com=short_w, adjust=True).mean()
ema_long = df['ADL'].ewm(ignore_na=False, min_periods=0, com=long_w, adjust=True).mean()
df['Chaikin'] = ema_short - ema_long
Tip
- Buy signal: when the oscillator is positive
- Sell signal: when the oscillator is negative
On-Balance Volume (OBV)¶
The formula for OBV changes according to the following 3 cases:
1) If closing price > prior close price:
2) If closing price < prior close price:
3) If closing price = prior close price then:
We could traverse the dataframe, and use if-else statements to capture the 3 conditions:
obv = [0] * len(df) # for storing the on-balance volume
array_close = list(df['Close'])
array_volume = list(df['Volume'])
for i in range(1, len(df)):
    if array_close[i] > array_close[i-1]:
        obv[i] = obv[i-1] + array_volume[i]
    elif array_close[i] < array_close[i-1]:
        obv[i] = obv[i-1] - array_volume[i]
    else:
        obv[i] = obv[i-1]
Tip
- A rising OBV reflects positive volume pressure that can lead to higher prices
- A falling OBV reflects negative volume pressure that can foreshadow lower prices
Volume Rate of Change¶
The way of calculating Volume ROC is similar to ROC:
n = 25 # example time period
df['Volume ROC'] = ((df['Volume'] - df['Volume'].shift(n)) /
    df['Volume'].shift(n))
Here is a simple example strategy based on Volume ROC:
Tip
- Buy signal: if Volume ROC goes above zero
- Sell signal: if Volume ROC goes below zero
References
Image sources
[1] | By Probe-meteo.com - Probe-meteo.com, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=26048221 |
Attention
Fundamental analysis¶
In this tutorial, you will learn:
- The basics of fundamental analysis
- What are financial ratios
- How to carry out stock screening
Intro to fundamental analysis¶
- The quantitative aspect concerns studying numerical figures like the company’s revenues, profits, assets, debts, etc.
- The qualitative aspect focuses on intangible factors like the company’s management background and brand recognition.
Financial ratios¶
- Price-earnings (P/E) ratio
- Earnings per share (EPS) ratio
- Debt-to-equity ratio
- Return on equity (ROE) ratio
Short-term solvency ratios¶
Definition
Current ratio¶
Quick ratio¶
Cash ratio¶
Net working capital to current liabilities¶
\text{Net working capital to current liabilities} = \frac{\text{Current assets} - \text{Current liabilities}}{\text{Current liabilities}}
Turnover ratios¶
Definition
Inventory turnover ratios¶
Receivable turnover¶
Fixed asset turnover¶
Total asset turnover¶
Financial leverage ratios¶
Definition
Total debt ratio¶
Debt to equity ratio¶
Equity ratio¶
Long-term debt ratio¶
Times interest earned ratio¶
Profitability ratios¶
Definition
Gross profit margin¶
Net profit margin¶
Return on assets (ROA)¶
Return on equity (ROE)¶
Ratio analysis for stock screening¶
# take equity ratio as an example
equity_ratio = (df["Total stockholders' equity"].astype(int)
    / df['Total Assets'].astype(int)) # Total stockholders' equity / Total Assets
# store all the ratios in the `ratios` dataframe
ratios["Equity ratio"] = equity_ratio
# e.g. filter out stocks with profitability ratio greater than overall mean
mask1 = (ratios['EPS'] > ratios['EPS'].mean())
mask2 = (ratios['ROE'] > ratios['ROE'].mean())
mask3 = (ratios['ROA'] > ratios['ROA'].mean())
# apply the masks
ratios[(mask1) & (mask2) & (mask3)]
References
Attention
Evaluation metrics¶
The code for the evaluation metrics could be found in code/evalaute.py in the repository.
Portfolio return¶
Definition
Sharpe ratio¶
Definition
Mathematically, it is the average return earned in excess of the risk-free rate per unit of total volatility.
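In code, the annualised Sharpe ratio on daily portfolio returns could be computed as follows (a sketch, assuming a zero risk-free rate, 252 trading days per year, and a portfolio dataframe with a 'returns' column like the one produced by the backtester):
import numpy as np
# Annualised Sharpe ratio of the daily strategy returns
sharpe_ratio = np.sqrt(252) * portfolio['returns'].mean() / portfolio['returns'].std()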
Maximum drawdown (MDD)¶
Definition
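A sketch of computing MDD from the portfolio's total value (assuming the portfolio dataframe produced by the backtester; the 252-day window is illustrative):
# Rolling maximum of the portfolio value over a 252-day window
window = 252
rolling_max = portfolio['total'].rolling(window, min_periods=1).max()
# Daily drawdown relative to that peak; MDD is the worst (most negative) value
daily_drawdown = portfolio['total'] / rolling_max - 1.0
max_drawdown = daily_drawdown.min()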
Compound Annual Growth Rate (CAGR)¶
Definition
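For reference, the usual formula (with n the number of years the portfolio is held) is:
\text{CAGR} = \left(\frac{\text{Ending value}}{\text{Beginning value}}\right)^{1/n} - 1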
Standard Deviation¶
Definition
Attention
Introduction to Julia¶
Julia is a high-level, high-performance programming language that is designed for numerical analysis and computational science. Nature describes it as the “best of both worlds” - a language that combines the interactivity and syntax of ‘scripting’ languages, such as Python, Matlab and R, with the speed of ‘compiled’ languages such as Fortran and C.
Why use Julia¶
Julia is designed for speed, performance and scalability. Here are some highlights about the best features of Julia:
- It has a legible syntax and is easy to learn.
- It incorporates vector notations and DataFrames as part of the language.
- It compiles code to efficient native code before execution, and thus is designed to be fast.
Many financial firms including BlackRock and State Street are using Julia as their primary development language. With the belief that the community for Julia will continue to grow, here we would like to introduce some examples of using Julia to write algorithmic trading strategies.
Getting started¶
The official repo and installation instructions could be found here.
Here are some good tutorials for picking up the language:
References
Attention
Data science basics¶
In this tutorial, you will learn to do the following in Julia:
- Load and output csv files
- Manipulate DataFrames
You could run the code for this tutorial in code/technical-analysis_julia/data-science-basics.ipynb.
Make sure you have installed Julia and all the required dependencies (follow instructions here).
You would need to install and import the following libraries:
# import libraries
using CSV;
using Dates;
using DataFrames;
using Statistics;
using Plots;
using StatsPlots;
using RollingFunctions;
Read and output csv file¶
The CSV package is useful for loading and manipulating dataframes.
# load data
df = CSV.File("../../database/hkex_ticks_day/hkex_0001.csv") |> DataFrame
# save as csv
CSV.write("test.csv", df)
Data inspection¶
The first and last functions are similar to head and tail in pandas. Additionally, we also have describe, which returns a summary of the dataframe.
first(df, 5) # show first 5 rows
last(df, 5) # last 5 rows
describe(df) # get summary of df
We can get the column names by:
names(df) # column names
Data selection¶
As we always select rows within a particular date range for stock price data, here is how to do it:
df[(df.Date .> Date(2017, 1)) .& (df.Date .< Date(2019, 1)), :]
Alternatively, we could generate a list of dates and check if each date is in this range:
dates = [Date(2017, 1),Date(2018)];
yms = [yearmonth(d) for d in dates];
print(yms) # [(2017, 1), (2018, 1)]
first(df[in(yms).(yearmonth.(df.Date)), :], 10)
We can select column(s) by the following way:
close = select(df, :Close) # select column "Close"
close = select(df, [:Close, :Volume]) # select columns "Close", "Volume"
References
- Towards Data Science - What is Exploratory Data Analysis?
- Jason Brownlee - A Gentle Introduction to Statistical Sampling and Resampling
- Towards Data Science - Using the Pandas “Resample” Function
- Algorithmic trading explained
- DataCamp - Python for Finance: Algorithmic Trading
Attention
Write a technical strategy¶
In this tutorial, you will learn to do the following in Julia:
- Write a trading strategy
- Backtest the strategy
- Plot the trading signals
- Evaluate strategy performance
You could run the code for this tutorial in code/technical-analysis_julia/moving_average.jl or code/technical-analysis_julia/moving-average.ipynb.
You would need to install and import the following libraries:
# import libraries
using CSV;
using Dates;
using DataFrames;
using Statistics;
using Plots;
using PyCall;
using RollingFunctions;
using Printf;
@pyimport matplotlib.pyplot as plt
First, load the file from the sample database:
# load data
df = CSV.File("../../database/microeconomic_data/hkex_ticks_day/hkex_0001.csv") |> DataFrame
ticker = "0001.HK" # set this variable for plot title
Build the strategy¶
We will use Moving Average (MA) as the example here. You could refer to the section Moving Averages (MA) to read the formulae and conditions for generating buy/sell signals.
Calculate the short and long moving averages,
and respectively append them to the signals
dataframe:
# initialise signals dataframe
signals = df[:,[:Date, :Close]]
dates = Array(convert(Matrix, select(df, :Date))) # get dates
close = convert(Matrix, select(df, :Close)) # get closing price
# short MA
short_window = 40
short_mavg = runmean(vec(close), short_window)
insertcols!(signals, 1, :short_mavg => short_mavg)
# long MA
long_window = 100
long_mavg = runmean(vec(close), long_window)
insertcols!(signals, 1, :long_mavg => long_mavg)
Generate the buy and sell signals based on the short_mavg and long_mavg columns:
# Create signals
signal = Float64[]
for i in 1:length(short_mavg)
if short_mavg[i] > long_mavg[i]
x = 1.0 # buy signal
else
x = 0.0
end
push!(signal, x)
end
insertcols!(signals, 1, :signal => signal)
Generate the positions by taking the row differences in signal:
# Generate positions
function gen_pos(signal)
positions = zeros(length(signal))
positions[1] = 0
for i in 2:length(signal)
positions[i] = signal[i] - signal[i-1]
end
return positions
end
positions = gen_pos(signal)
insertcols!(signals, 1, :positions => positions)
We also generate temporary arrays for plotting buy and sell signals respectively:
# Generate tmp arrays to plot buy signals
buy_signals = DataFrame()
buy_dates = []
buy_prices = []
for i in 1:length(positions)
if (positions[i] == 1.0)
push!(buy_dates, dates[i])
push!(buy_prices, close[i])
end
end
insertcols!(buy_signals, 1, :Date => buy_dates)
insertcols!(buy_signals, 1, :Price => buy_prices)
#print(first(buy_signals,10))
# Generate tmp arrays to plot sell signals
sell_signals = DataFrame()
sell_dates = []
sell_prices = []
for i in 1:length(positions)
if (positions[i] == -1.0)
push!(sell_dates, dates[i])
push!(sell_prices, close[i])
end
end
insertcols!(sell_signals, 1, :Date => sell_dates)
insertcols!(sell_signals, 1, :Price => sell_prices)
Plotting graphs¶
As we make use of matplotlib
to plot graphs, the functions are very similar to those
we have used in Python.
fig = plt.figure() # Initialise the plot figure
ax1 = fig.add_subplot(111, ylabel="Price in \$") # Add a subplot and label for y-axis
# plot moving averages as line
plt.plot(signals.Date, signals.short_mavg, color="blue", linewidth=1.0, label="Short MA")
plt.plot(signals.Date, signals.long_mavg, color="orange", linewidth=1.0, label="Long MA")
# plot signals with colour markers
plt.plot(buy_signals.Date, buy_signals.Price, marker=10, markersize=7, color="m", linestyle="None", label="Buy signal")
plt.plot(sell_signals.Date, sell_signals.Price, marker=11, markersize=7, color="k", linestyle="None", label="Sell signal")
plt.title("MA crossover signals")
plt.show()
# save fig
fig.savefig("./figures/moving-average-crossover_signals", dpi=100)
Backtesting¶
We could then backtest the strategy on the historical price data:
initial_capital = 100000.0
# Initialise the portfolio with value owned
portfolio = signals[:,[:Date, :Close, :positions]]
portfolio[:trade] = signals[:Close] .* (100 .* signals[:positions])
# Add `holdings` to portfolio
portfolio[:quantity] = cumsum(100 .* signals[:positions])
portfolio[:holdings] = portfolio[:Close] .* portfolio[:quantity]
# Add `cash` to portfolio
portfolio[:cash] = initial_capital .- cumsum(portfolio[:trade])
# Add `total` to portfolio
portfolio[:total] = portfolio[:cash] .+ portfolio[:holdings]
portfolio_total = Array(portfolio[:total])
# Generate returns
function gen_returns(portfolio_total)
returns = zeros(length(portfolio_total))
returns[1] = 0
for i in 2:length(portfolio_total)
returns[i] = (portfolio_total[i] - portfolio_total[i-1]) / portfolio_total[i-1]
end
return returns
end
returns = gen_returns(portfolio_total)
insertcols!(portfolio, 1, :returns => returns)
# Print final portfolio value and total return in terminal
@printf("Final total value: %f\n", portfolio.total[size(portfolio,1)])
total_return = (portfolio.total[size(portfolio,1)] - portfolio.total[1]) / portfolio.total[1]
@printf("Total return: %f\n", total_return)
Strategy evaluation¶
Note that all the evaluation metric functions are designed to take the portfolio dataframe as argument. You could refer to the Evaluation metrics section for the mathematical formulae of each evaluation metric.
Portfolio return¶
function portfolio_return(portfolio)
fig = plt.figure() # Initialise the plot figure
ax1 = fig.add_subplot(111, ylabel="Total in \$") # Add a subplot and label for y-axis
plt.plot(portfolio.Date, portfolio.returns, color="blue", linewidth=1.0, label="Returns")
plt.title("MA crossover portfolio return")
plt.show()
# save fig
fig.savefig("./figures/moving-average-crossover_returns", dpi=100)
end
# call function
portfolio_return(portfolio)
Sharpe ratio¶
function sharpe_ratio(portfolio)
# Annualised Sharpe ratio
sharpe_ratio = sqrt(252) * (mean(portfolio.returns) / std(portfolio.returns))
return sharpe_ratio
end
# Call function and print output
sharpe = sharpe_ratio(portfolio)
@printf("Sharpe ratio: %f\n", sharpe)
Compound Annual Growth Rate (CAGR)¶
function CAGR(portfolio)
# Get the number of days in df
format = DateFormat("y-m-d")
days = portfolio.Date[size(portfolio,1)] - portfolio.Date[1]
# Calculate the CAGR
cagr = ^((portfolio.total[size(portfolio,1)] / portfolio.total[1]), (252.0 / Dates.value(days))) - 1
return cagr
end
# Call function and print output
cagr = CAGR(portfolio)
@printf("CAGR: %f\n", cagr)
Attention
Bankruptcy prediction¶
In this tutorial, you will learn to:
- Train a machine learning model for bankruptcy prediction
- Carry out inferencing
- Screen out stocks based on the results
Intro to bankruptcy prediction¶
Bankruptcy prediction with machine learning¶
- Collect labelled data (bankrupted = 0, survived = 1)
- Split the dataset into a training set and a test set in a ratio of 7:3
- Train the machine learning model(s) with the labelled data (training set)
- Evaluate the accuracy of the training model with the test set
- Make use of the models to predict our own set of data (i.e. inferencing)
Data collection¶
Here, t1 means 1 year, t2 means 2 years, and so on.
# Concatenate data
data_full = pd.concat([bankrupt_data, non_bankrupt_data], ignore_index=True)
# Add and scale variables
data_full["X1"] = preprocessing.scale(data_full["WoCap"] / data_full["ToAsset"])
data_full["X2"] = preprocessing.scale(data_full["CFOper"] / data_full["ToLia"])
data_full["X3"] = preprocessing.scale(data_full["EBIT"] / data_full["ToAsset"])
data_full["X4"] = preprocessing.scale(data_full["ToEqui"] / data_full["ToAsset"])
data_full["X5"] = preprocessing.scale(data_full["NetInc"] / data_full["ToAsset"])
data_full["X6"] = preprocessing.scale(data_full["ToLia"] / data_full["ToAsset"])
Train-test split¶
# Split data for training and testing
X = data_full[["X1", "X2", "X3", "X4","X5","X6"]]
y = data_full['Status']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)
- Support Vector Machine
- Decision Tree
- Random Forest
- K-Nearest Neighbor (KNN)
Training the model¶
# after loading and pre-processing the data
# create an instance of the class
model = bankrupt_prediction(bankrupt_t1, non_bankrupt_t1, input_df, False)
# train with knn
t1_results = model.knn()
Evaluate accuracy¶
# output:
Confusion Matrix using Decision Tree:
[[7 2]
[2 9]]
Classification Report using Decision Tree:
precision recall f1-score support
0 0.78 0.78 0.78 9
1 0.82 0.82 0.82 11
accuracy 0.80 20
macro avg 0.80 0.80 0.80 20
weighted avg 0.80 0.80 0.80 20
Inferencing¶
To carry out inferencing, set inferencing to True and call the method again.
model.inferencing = True # set inferencing as True
t1_results = model.knn()
# save results to a csv file
t1_results.to_csv("./results/t1_results-knn.csv", header='column_names', index=False)
We then filter out the stocks that are predicted to survive, i.e. those with label 1.
df_t1 = pd.read_csv("./results/t1_results-knn.csv", index_col=None, header=0)
mask = (df_t1['knn'] == 1)
# get the percentage of survival
len(df_t1[mask]) / len(df_t1)
df_filtered = df_t1[mask]
df_filtered.to_csv("survived-t1.csv", header='column_names', index=False)
Sources
[1] | SAF2002, https://ja.wikipedia.org/wiki/SAF2002 |
Attention
Property price prediction¶
In this tutorial, you will learn:
- The basics in macroeconomic analysis
- The ways of analyzing macroeconomic indicators
- The ways of analyzing real estate market data
- How to build a property price prediction model
Intro to macroeconomic analysis¶
Macroeconomic indicators in Hong Kong¶
import random
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
We use df.info() to print information of all columns.
The column information of macroeconomic data.
Univariate analysis¶
We use pandas.DataFrame.describe() to examine the distribution of the numerical features. It returns a statistical summary, such as mean, standard deviation, min, and max, of a data frame. We then use seaborn.distplot() to visualise the results with histograms.
# Statistical summary
print(df[feature_name].describe())
# Histogram
plt.figure(figsize=(8,4))
sns.distplot(df[feature_name], axlabel=feature_name);
Bivariate analysis¶
We use matplotlib.pyplot.scatter() and seaborn.regplot() to visualise the relationship between two features.
x = df[feature_name]
y = df['hsi']
plt.scatter(x, y)
plt.xticks(rotation=45)
fig = sns.regplot(x=feature_name, y="hsi", data=df)

An example of a scatter plot with a regression line.
We use pandas.DataFrame.corr() and seaborn.heatmap() to compute a pairwise correlation of features and visualise the correlation matrix.
fig, ax = plt.subplots(figsize=(10,10))
cols = df.corr().sort_values('hsi', ascending=False).index
cm = np.corrcoef(df[cols].values.T)
hm = sns.heatmap(cm, annot=True, square=True, annot_kws={'size':11}, yticklabels=cols.values, xticklabels=cols.values)
plt.show()

Heatmap - macroeconomic indicators of Hong Kong.
The Hong Kong real estate market¶
Data pre-processing¶
1. Derive some useful features from existing features.
# Add new features
df['month'] = pd.to_datetime(df['RegDate']).dt.month
df['year'] = pd.to_datetime(df['RegDate']).dt.year
2. Drop features that are not meaningful and features with too many missing values
# Drop unnecessary columns
df = df.drop([feature_name], axis=1)
3. Handle missing values by replacing NaN with the mean value of a feature
# Handling missing values
# Fill with mean
feature_name_mean = df[feature_name].mean()
df[feature_name] = df[feature_name].fillna(feature_name_mean)
4. Label encode categorical features
le = LabelEncoder()
le.fit(list(processed_df[feature_name].values))
processed_df[feature_name] = le.transform(list(processed_df[feature_name].values))
Economic indicator analysis¶

The data structure of transaction record (Centaline Property).
# calculate the monthly average house price
df = df.groupby(['year','month'],as_index=False).mean()
df = df.rename(columns={'UnitPricePerSaleableArea': 'AveragePricePerSaleableArea'})

Heatmap - economic indicators analysis.
Transaction record analysis¶

The data structure of transaction record (Midland Realty) - Part 1.

The data structure of transaction record (Midland Realty) - Part 2.
# Distribution
print(df['price'].describe())
# Skewness and kurtosis
print("Skewness: ", df['price'].skew())
print("Kurtosis: ", df['price'].kurt())
#output:
count 1.664090e+05
mean 9.133268e+06
std 1.310856e+07
min 5.500000e+05
25% 5.200000e+06
50% 6.830000e+06
75% 9.500000e+06
max 1.399000e+09
Name: price, dtype: float64
Skewness: 26.927207752922435
Kurtosis: 1526.4066673335874
# Calculate mean and standard deviation
data_mean, data_std = np.mean(df[feature_name]), np.std(df[feature_name])
# Calculate upper boundary
upper = data_mean + data_std * 3
# Remove outliers
df = df[df[feature_name] < upper]

Heatmap - transaction data analysis.
The code for the macroeconomic analysis could be found in code/macroeconomic-analysis/ in the repository.
Property price prediction with machine learning¶
Train-test split¶
We use sklearn.model_selection.train_test_split() to split the data with a ratio of 8:2. The input variables are the top 7 features selected from the analysis, and the output feature is the house price.
feat_col = [c for c in df.columns if c not in ['price']]
x_df, y_df = df[feat_col], df['price']
x_train, x_test, y_train, y_test = train_test_split(x_df, y_df, test_size=0.2, random_state=RAND_SEED)
Log transformation¶
We transform y_train using a log function to normalise the highly skewed price data. In this way, the dynamic range of Hong Kong’s property price can be reduced.
log_y_train = np.log1p(y_train)
Training the model¶
- XGBoost
- Lasso
- Random Forest
- Linear Regression
We train the models with x_train and y_train, and use the models to make the predictions.
import xgboost as xgb
# XGBoost
model_xgb = xgb.XGBRegressor(objective ='reg:squarederror',
learning_rate = 0.1, max_depth = 5, alpha = 10,
random_state=RAND_SEED, n_estimators = 1000)
model_xgb.fit(x_train, log_y_train)
xgb_train_pred = np.expm1(model_xgb.predict(x_train))
xgb_test_pred = np.expm1(model_xgb.predict(x_test))
Evaluate accuracy¶
from sklearn.metrics import mean_squared_log_error
def rmsle(y, y_pred):
    return np.sqrt(mean_squared_log_error(y, y_pred))
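The metric could then be evaluated on the back-transformed predictions from the earlier snippet (variable names follow that snippet):
# Compare predictions against the actual prices on the original scale
print("XGBoost RMSLE(train):", rmsle(y_train, xgb_train_pred))
print("XGBoost RMSLE(test):", rmsle(y_test, xgb_test_pred))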
#output:
XGBoost RMSLE(train): 0.1626671056150446
XGBoost RMSLE(test): 0.16849945199484243
plt.figure(figsize=(5,5))
plt.scatter(y_test,xgb_test_pred)
plt.xlabel('Actual Y')
plt.ylabel('Predicted Y')
plt.show()

The graph of actual and predicted house price for XGBoost.
Attention
Sentiment analysis¶
In this tutorial, you will learn:
- The basics of sentiment analysis
- How to collect tweets
- How to collect financial news headlines
- What are the common ways of analysing sentiment
- How to measure the accuracy of the sentiment prediction
Intro to sentiment analysis¶
Collection of tweets¶
Apply for a developer account from Twitter and use Tweepy
Code example
import tweepy
# do not share the API key in any public platform (e.g github, public website)
consumer_key = "API key"
consumer_secret = "API secret"
access_token = "Access token"
access_token_secret = "Access secret"
# authorisation of consumer key and consumer secret
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
Access the relevant tweets using the Twitter API
Timeline tweets¶
Timeline tweets are accessed by specifying the id parameter: use the user_id or screen_name parameter to access the user-specified tweets. For more information regarding the parameters, please visit the official documentation: https://docs.tweepy.org/en/v3.5.0/api.html
Code example
# create an empty list
alltweets = []
# extract data from the API
timeline = api.user_timeline(user_id=userid, count=number_of_tweets)
alltweets.extend(timeline)
with open('%s_tweets.csv' % screen_name, 'a') as f:
    writer = csv.writer(f)
    for tweet in alltweets:
        tweet_text = tweet.text.encode("utf-8")
        dates = tweet.created_at
        writer.writerow([dates, tweet_text])
Hashtag/Cashtag tweets¶
We use tweepy.Cursor() to access data from hashtags and cashtags.
Code example
# extract data from the API
hashtags = tweepy.Cursor(api.search, q=name, lang='en', tweet_mode='extended').items(200)
with open('%s_tweets.csv' % screen_name, 'a') as f:
    writer = csv.writer(f)
    for status in hashtags:
        tweet_text = status.full_text
        dates = str(status.created_at)[:10]
        writer.writerow([dates, tweet_text])
To collect only recent tweets, we add a date check inside the loop:
with open('%s_tweets.csv' % screen_name, 'a') as f:
    writer = csv.writer(f)
    for status in hashtags:
        # Add this line
        if (datetime.datetime.now() - status.created_at).days <= day_required:
            tweet_text = status.full_text
            dates = str(status.created_at)[:10]
            writer.writerow([dates, tweet_text])
Stream tweets¶
- Create a class inheriting from StreamListener
# override tweepy.StreamListener
class MyStreamListener(tweepy.StreamListener):
    # add logic to the on_status method
    def on_status(self, status):
        if self.tweet_count == self.max_tweets:
            return False
        # collect tweets
        else:
            tweet_text = status.text
            writer = csv.writer(self.output_file)
            writer.writerow([status.created_at, status.extended_tweet['full_text'].encode("utf-8")])
            self.tweet_count += 1
    # add logic to the initialisation function
    def __init__(self, output_file=sys.stdout, input_name=sys.stdout):
        super(MyStreamListener, self).__init__()
        self.max_tweets = 200
        self.tweet_count = 0
        self.output_file = output_file
        self.input_name = input_name
- Create a stream
# add an output_file parameter to store the output tweets
myStreamListener = MyStreamListener(output_file=f, input_name=firm)
myStream = tweepy.Stream(auth=api.auth, tweet_mode='extended', listener=myStreamListener, languages=["en"])
- Start a stream
myStream.filter(track=target_firm)
Collect financial headlines¶
US news headlines¶

- Access the website of each ticker through the urllib.request module
allnews = []
finviz_url = 'https://finviz.com/quote.ashx?t='
url = finviz_url + ticker
req = Request(url=url, headers={'user-agent': 'my-app/0.0.1'})
- Get the HTML document using Beautiful Soup
resp = urlopen(req)
html = BeautifulSoup(resp, features="lxml")
- Get the information of <div> id=’news-table’ in the website
news_table = html.find(id='news-table')
news_tables[ticker] = news_table
- Find all the news under the <tr> tag in the news-table
for info in news_table.findAll('tr'):
    text = info.a.get_text()
    date_scrape = info.td.text.split()
    if len(date_scrape) == 1:
        time = date_scrape[0]
    else:
        date = date_scrape[0]
        time = date_scrape[1]
    news_time_str = date + " " + time
- Convert the date format to ‘YYYY-MM-dd’
date_time_obj = datetime.datetime.strptime(news_time_str, '%b-%d-%y %I:%M%p')
date_time=date_time_obj.strftime('%Y-%m-%d')
- Append all the news together
allnews.append([date_time,text])
HK news headlines¶

The 'date' attribute is stored within the <div class='inline_block'> under the <div class='newstime 4'>, while the news headlines are stored within the <div class='newscontent4 mar8T'>.
- Access the website of each ticker through the urllib.request module
prefix_url = 'http://www.aastocks.com/en/stocks/analysis/stock-aafn/'
postfix_url = '/0/all/1'
url = prefix_url + fill_ticker + postfix_url
req = Request(url=url, headers={'user-agent': 'my-app/0.0.1'})
resp = urlopen(req)
- Get the HTML document using Beautiful Soup
html = BeautifulSoup(resp, features="lxml")
# get the html code containing the dates and news
dates = html.findAll("div", {"class": "inline_block"})
news = html.findAll("div", {"class": "newshead4"})
- Find all the news and corresponding dates from the html code from step 2
# track the index in the news list
idx = 0
with open('%s_tweets.csv' % screen_name, 'a') as f:
    writer = csv.writer(f)
    for i in dates:
        # as the dates are in yyyy/mm/dd format
        if "/" in str(i.get_text()):
            date = str(i.get_text())
            # the front-end code is not standardised and sometimes contains a 'Release Time' string
            if "Release Time" in date:
                date = date[13:23]
            else:
                date = str(date[:10])
            text = news[idx].get_text()
            date_time_obj = datetime.datetime.strptime(date, '%Y/%m/%d')
            # standardise the date format as 'YYYY-mm-dd'
            date_time = date_time_obj.strftime('%Y-%m-%d')
            # set the number of days you want to collect
            if (datetime.datetime.now() - date_time_obj).days <= day_required:
                writer.writerow([date_time, text])
            idx += 1
VADER sentiment prediction¶
- Positive sentiment (= 2): compound score > 0.01
- Neutral sentiment (= 1): −0.01 ≤ compound score ≤ 0.01
- Negative sentiment (= 0): compound score < −0.01
- Import these libraries
import pandas as pd
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk.corpus import twitter_samples
- VADER’s SentimentIntensityAnalyzer() takes in a string and returns a dictionary of scores in each of four categories:
- negative
- neutral
- positive
- compound (computed by normalising the scores above, ranging from -1 to 1)
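For instance, a single headline could be scored like this (a minimal, self-contained example; the lexicon download is only needed once):
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')  # only needed the first time
analyzer = SentimentIntensityAnalyzer()
print(analyzer.polarity_scores("Stock surges after record quarterly earnings"))
# returns a dict with 'neg', 'neu', 'pos' and 'compound' scores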
Let us analyse the data that we have collected through the sentimental analyser.
# pass in the path where you stored the csv file containing the data
def read_tweets_us_path(path):
    # could change to your own path
    path = os.path.join(dir_name, 'train-data/' + path)
    # read in data as pandas dataframe
    df = pd.read_csv(path)
    cs = []
    for row in range(len(df)):
        cs.append(analyzer.polarity_scores(df['tweets'].iloc[row])['compound'])
    # create a new column for the calculated results
    df['compound_vader_score'] = cs
    print(df)
    return df
- Label the sentiment for each tweet
Parameters:
- grouped_data: consolidated data with features including (dates, tweets, compound_vader_score)
- file_name: the output name after the label function
- perc_change: the threshold value for labelling the sentiment
Code example
def find_tweets_pred_label(grouped_data, file_name, perc_change):
    print('find_pred_label')
    tweets = grouped_data['tweets']
    # group the tweets within the csv using the ['dates','ticker'] index
    grouped_data = grouped_data.groupby(['dates','ticker'])['compound_vader_score'].mean().reset_index()
    final_label = []
    for i in range(len(grouped_data)):
        if grouped_data['compound_vader_score'].iloc[i] > perc_change:
            final_label.append(2)
        elif grouped_data['compound_vader_score'].iloc[i] < -perc_change:
            final_label.append(0)
        elif ((grouped_data['compound_vader_score'].iloc[i] >= -perc_change) and (grouped_data['compound_vader_score'].iloc[i] <= perc_change)):
            final_label.append(1)
    # add the column of vader_label
    grouped_data['vader_label'] = final_label
    grouped_data['tweets'] = tweets
    grouped_data.to_csv(file_name)
- actual label (= 2): price movement ≥ 0.01
- actual label (= 1): −0.01 ≤ price movement ≤ 0.01
- actual label (= 0): price movement ≤ −0.01
Parameters:
- file_name: consolidated data with features including (dates, tweets, compound_vader_score)
- label_data: the label data containing the actual label from Yahoo Finance
Code example
def merge_actual_label(file_name, label_data):
    vader_data = pd.read_csv(file_name)
    vader_data.set_index(keys=["dates","ticker"], inplace=True)
    label_data = pd.read_csv(label_data)
    label_data.set_index(keys=["dates","ticker"], inplace=True)
    # merge the actual label and the predicted label into a single pandas data frame
    merge = pd.merge(vader_data, label_data, how='inner', left_index=True, right_index=True)
    merge = merge.drop(columns=['Unnamed: 0_y'], axis=1)
    return merge
Parameters:
- df: the final merged pandas dataframe
- name: the output csv file containing all the merged information with dates, tweets, vader label and actual label
Code illustration
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
def validation(df, name):
    pred_label = list(df['vader_label'])
    actual_label = list(df['label'])
    labels = [0, 1, 2]
    cm = confusion_matrix(actual_label, pred_label, labels=labels)
    categories = ['Negative', 'Neutral', 'Positive']
    # plot the 3x3 confusion matrix with the repo's plotting helper
    make_confusion_matrix(cm, categories=categories)
    df.to_csv(name)
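make_confusion_matrix is a custom plotting helper from the repo; if it is not available, a similar plot can be produced inside validation() with scikit-learn's built-in ConfusionMatrixDisplay, for example:
from sklearn.metrics import ConfusionMatrixDisplay

# inside validation(), as an alternative to make_confusion_matrix(cm, ...)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=categories)
disp.plot()
plt.show()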
Attention
Integrated trading strategy¶
In this tutorial, you will learn:
- How to make use of different features to write a strategy
- How to use machine learning models to predict trading signals
Putting it all together¶
Simple approach¶
Baseline model¶
The baseline model is implemented in integrated-strategy/baseline.py and works as follows:
- Generate the signals dataframe using a technical analysis strategy (e.g. MACD)
- Pass the signals to the macroeconomic filter
- Pass the signals to the sentiment filter
- Backtest with the filtered signals dataframe
As usual, we first select the ticker and time range to run the strategy:
# load price data
df_whole = pd.read_csv('../../database/microeconomic_data/hkex_ticks_day/hkex_0001.csv', header=0, index_col='Date', parse_dates=True)
ticker = "0005.HK"
# select time range (for trading)
start_date = pd.Timestamp('2017-01-01')
end_date = pd.Timestamp('2021-01-01')
df = df_whole.loc[start_date:end_date]
Besides df, we also prepare filtered_df, which is additionally used to calculate the stock price’s sensitivity to economic indicators:
# get filtered df for macro analysis
filtered_df = df_whole.loc[:end_date]
# apply MACD crossover strategy
macd_cross = macdCrossover(df)
macd_fig = macd_cross.plot_MACD()
plt.close() # hide figure
# get signals dataframe
signals = macd_cross.gen_signals()
print(signals.head())
signal_fig = macd_cross.plot_signals(signals)
plt.close() # hide figure
# get ticker's sensitivity to macro data
s_gdp, s_unemploy, s_property = GetSensitivity(filtered_df)
# append signals with macro data
signals = GetMacrodata(signals)
# calculate adjusting factor
signals['macro_factor'] = s_gdp * signals['GDP'] + s_unemploy * signals['Unemployment rate'] + s_property * signals['Property price']
signals['signal'] = signals['signal'] + signals['macro_factor']
# round off signals['signal'] to the nearest integer
signals['signal'] = signals['signal'].round(0)
filtered_signals = SentimentFilter(ticker, signals)
portfolio, backtest_fig = Backtest(ticker, filtered_signals, df)
plt.close() # hide figure
# print stats
print("Final total value: {value:.4f} ".format(value=portfolio['total'][-1]))
print("Total return: {value:.4f}%".format(value=(((portfolio['total'][-1] - portfolio['total'][0])/portfolio['total'][-1]) * 100))) # for analysis
print("No. of trade: {value}".format(value=len(signals[signals.positions == 1])))
# Evaluate strategy
# 1. Portfolio return
returns_fig = PortfolioReturn(portfolio)
returns_fig.suptitle('Baseline - Portfolio return')
#returns_fig.savefig('./figures/baseline_portfolo-return')
plt.show()
# 2. Sharpe ratio
sharpe_ratio = SharpeRatio(portfolio)
print("Sharpe ratio: {ratio:.4f} ".format(ratio = sharpe_ratio))
# 3. Maximum drawdown
maxDrawdown_fig, max_daily_drawdown, daily_drawdown = MaxDrawdown(df)
maxDrawdown_fig.suptitle('Baseline - Maximum drawdown', fontsize=14)
#maxDrawdown_fig.savefig('./figures/baseline_maximum-drawdown')
plt.show()
# 4. Compound Annual Growth Rate
cagr = CAGR(portfolio)
print("CAGR: {cagr:.4f} ".format(cagr = cagr))
Machine learning approach¶
Recurrent Neural Networks¶

The internal structure of an LSTM. [1]
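The scripts below instantiate an LSTM class defined in the repo; a minimal sketch of such a wrapper around torch.nn.LSTM (the repo's actual implementation may differ) looks like this:
import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_layers):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        # batch_first=True expects input of shape (batch, seq_len, input_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim)
        out, _ = self.lstm(x, (h0, c0))
        # use the hidden state of the last time step for the prediction
        return self.fc(out[:, -1, :])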
Single-feature LSTM model¶
The single-feature LSTM model is implemented in integrated-strategy/LSTM-train_price-only.py. First, load the price data and select the date ranges and ticker:
data_dir = "../../database/microeconomic_data/hkex_ticks_day/"
# select date range
dates = pd.date_range('2010-01-02','2016-12-31',freq='B')
test_dates = pd.date_range('2017-01-03','2020-09-30',freq='B')
# select ticker
symbol = "0001"
# load data
df = read_data(data_dir, symbol, dates)
df_test = read_data(data_dir, symbol, test_dates)
MinMaxScaler from sklearn.preprocessing is used to normalise the input features, i.e. they will be transformed into the range [-1, 1] in the following code snippet.
scaler = MinMaxScaler(feature_range=(-1, 1))
df['Close'] = scaler.fit_transform(df['Close'].values.reshape(-1,1))
df_test['Close'] = scaler.fit_transform(df_test['Close'].values.reshape(-1,1))
look_back = 60 # choose sequence length
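load_data is a helper from the repo; a minimal sketch of such a sliding-window split (assuming it windows the scaled Close column into sequences of length look_back and holds out the last 20% for testing) could be:
import numpy as np

def load_data(df, look_back, test_fraction=0.2):
    # build overlapping windows of length look_back from the scaled close prices
    data = df['Close'].values
    windows = np.array([data[i:i + look_back] for i in range(len(data) - look_back)])
    # the first look_back-1 values of each window form the input sequence,
    # the last value is the prediction target
    x, y = windows[:, :-1, None], windows[:, -1, None]
    split = int(len(x) * (1 - test_fraction))
    return x[:split], y[:split], x[split:], y[split:]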
We can check the shapes of the train and test data:
x_train, y_train, x_test, y_test = load_data(df, look_back)
print('x_train.shape = ',x_train.shape)
print('y_train.shape = ',y_train.shape)
print('x_test.shape = ',x_test.shape)
print('y_test.shape = ',y_test.shape)
And then make the training and testing sets in torch:
# make training and test sets in torch
x_train = torch.from_numpy(x_train).type(torch.Tensor)
x_test = torch.from_numpy(x_test).type(torch.Tensor)
y_train = torch.from_numpy(y_train).type(torch.Tensor)
y_test = torch.from_numpy(y_test).type(torch.Tensor)
Moving on, let’s set the hyperparameters.
# Hyperparameters
n_steps = look_back - 1
batch_size = 32
num_epochs = 100
input_dim = 1
hidden_dim = 32
num_layers = 2
output_dim = 1
torch.manual_seed(1) # set seed
train = torch.utils.data.TensorDataset(x_train,y_train)
test = torch.utils.data.TensorDataset(x_test,y_test)
train_loader = torch.utils.data.DataLoader(dataset=train,
batch_size=batch_size,
shuffle=False)
test_loader = torch.utils.data.DataLoader(dataset=test,
batch_size=batch_size,
shuffle=False)
model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
loss_fn = torch.nn.MSELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=0.01)
We’ll write the training loop now:
# Initialise an array to store the training losses
hist = np.zeros(num_epochs)
# Number of steps to unroll
seq_dim = look_back - 1
# Train model
for t in range(num_epochs):
    # Forward pass
    y_train_pred = model(x_train)
    loss = loss_fn(y_train_pred, y_train)
    if t % 10 == 0 and t != 0:
        print("Epoch ", t, "MSE: ", loss.item())
    hist[t] = loss.item()
    # Zero out gradient, else they will accumulate between epochs
    optimiser.zero_grad()
    # Backward pass
    loss.backward()
    # Update parameters
    optimiser.step()
# Make predictions
y_test_pred = model(x_test)
# Invert predictions
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test.detach().numpy())
# Calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(y_train[:,0], y_train_pred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:,0], y_test_pred[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
# Inferencing
y_inf_pred, y_inf = predict_price(df_test, model, scaler)
signal = gen_signal(y_inf_pred, y_inf)
# Save signals as csv file
output_df = pd.DataFrame(index=df_test.index)
output_df['signal'] = signal
output_df.index.name = "Date"
output_filename = 'output/' + symbol + '_output.csv'
output_df.to_csv(output_filename)
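predict_price and gen_signal are helpers from the repo, and their exact logic may differ from this sketch, but a simple hypothetical rule for turning predicted prices into signals (buy when the model predicts a rise, sell when it predicts a fall) could look like:
import numpy as np

def gen_signal_sketch(pred, actual, threshold=0.0):
    # hypothetical rule: go long when the predicted price is above the last observed
    # price, go short when it is below
    pred, actual = np.asarray(pred).ravel(), np.asarray(actual).ravel()
    signal = np.zeros(len(actual))
    for i in range(1, len(actual)):
        change = (pred[i] - actual[i - 1]) / actual[i - 1]
        if change > threshold:
            signal[i] = 1     # buy
        elif change < -threshold:
            signal[i] = -1    # sell
    return signal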
There is an output-backtester_wrapper.py in the same directory that loads all the files in the output directory and runs them with the backtester_wrapper to compute the evaluation metrics.
Multi-feature LSTM model¶
The multi-feature LSTM model is implemented in integrated-strategy/LSTM-train_wrapper.py. The walkthrough below focuses on the LSTM_predict function, as the main function is simply a wrapper that calls the LSTM_predict function with different ticker symbols.
# load file
dir_name = os.getcwd()
data_dir = os.path.join(dir_name,"database_real/machine_learning_data/")
sentiment_data_dir=os.path.join(dir_name,"database/sentiment_data/data-result/")
# Get merged df with stock tick and sentiment scores
df, scaled, scaler = merge_data(symbol, data_dir, sentiment_data_dir, strategy)
look_back = 60 # choose sequence length
x_train, y_train, x_test_df, y_test_df = load_data(scaled, look_back)
# make training and test sets in torch
x_train = torch.from_numpy(x_train).type(torch.Tensor)
x_test = torch.from_numpy(x_test_df).type(torch.Tensor)
y_train = torch.from_numpy(y_train).type(torch.Tensor)
y_test = torch.from_numpy(y_test_df).type(torch.Tensor)
# Hyperparameters
num_epochs = 100
lr = 0.01
batch_size = 72
input_dim = 7
hidden_dim = 64
num_layers = 4
output_dim = 7
torch.manual_seed(1) # set seed
print("Hyperparameters:")
print("input_dim: ", input_dim, ", hidden_dim: ", hidden_dim, ", num_layers: ", num_layers, ", output_dim", output_dim)
print("num_epochs: ", num_epochs, ", batch_size: ", batch_size, ", lr: ", lr)
train = torch.utils.data.TensorDataset(x_train,y_train)
test = torch.utils.data.TensorDataset(x_test,y_test)
train_loader = torch.utils.data.DataLoader(dataset=train,
batch_size=batch_size,
shuffle=False)
test_loader = torch.utils.data.DataLoader(dataset=test,
batch_size=batch_size,
shuffle=False)
model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
loss_fn = torch.nn.MSELoss()
optimiser = torch.optim.Adam(model.parameters(), lr=lr)
hist = np.zeros(num_epochs)
# Number of steps to unroll
seq_dim = look_back - 1
# Train model
for t in range(num_epochs):
    for i, (train_data, train_label) in enumerate(train_loader):
        # Forward pass
        train_pred = model(train_data)
        loss = loss_fn(train_pred, train_label)
        hist[t] = loss.item()
        # Zero out gradient, else they will accumulate between epochs
        optimiser.zero_grad()
        # Backward pass
        loss.backward()
        # Update parameters
        optimiser.step()
    if t % 10 == 0 and t != 0:
        y_train_pred = model(x_train)
        loss = loss_fn(y_train_pred, y_train)
        print("Epoch ", t, "MSE: ", loss.item())
# Make predictions
y_test_pred = model(x_test)
# Invert predictions
y_train_pred = scaler.inverse_transform(y_train_pred.detach().numpy())
y_train = scaler.inverse_transform(y_train.detach().numpy())
y_test_pred = scaler.inverse_transform(y_test_pred.detach().numpy())
y_test = scaler.inverse_transform(y_test.detach().numpy())
# Calculate root mean squared error
trainScore = math.sqrt(mean_squared_error(y_train[:,0], y_train_pred[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = math.sqrt(mean_squared_error(y_test[:,0], y_test_pred[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
visualise(df, y_test[:,0], y_test_pred[:,0], pred_filename)
signal_dataframe = gen_signal(y_test_pred[:,0], y_test[:,0], df[len(df)-len(y_test):].index, by_trend=True)
# Save signals as csv file
output_filename = 'LSTM_output_trend/' + symbol + '_output.csv'
signal_dataframe.to_csv(output_filename,index=False)
You can run the model for different tickers and strategies through the LSTM_predict function, for example by calling it in this way:
LSTM_predict('0001', 'macd-crossover')
Then use the output-backtester_wrapper.py file to backtest the output signals.
References
Image sources
[1] Christopher Olah, Understanding LSTM Networks, https://colah.github.io/posts/2015-08-Understanding-LSTMs
Attention
Paper Trading with Interactive Brokers¶
In this tutorial, you will learn:
- What is paper trading
- How to start paper trading with Interactive Brokers
Intro to paper trading¶
Definition
Setup Interactive Brokers API¶
- Visit InteractiveBrokers website, and open an account
- Download IB API software from InteractiveBrokers GitHub account
- Download TWS software from InteractiveBrokers TWS
- Choose an IDE that you code in
- Subscribe to market data
Connect to Interactive Brokers TWS¶
Call app.connect() to establish an API connection.
class App(EWrapper, EClient):
    def __init__(self):
        EClient.__init__(self, self)

# Establish API connection
# app.connect(ipAddress, portNumber, clientId)
app = App()
app.connect('127.0.0.1', 7497, 0)
app.run()
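Note that app.run() starts the client's message loop and blocks the current thread; when you want to keep issuing requests from the same script, a common pattern (not shown in the snippet above) is to run the loop in a background thread:
import threading
import time

# run the message loop in a background thread so the main thread can keep issuing requests
api_thread = threading.Thread(target=app.run, daemon=True)
api_thread.start()
time.sleep(1)  # give TWS a moment to acknowledge the connection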

Create Basic Contracts¶
from ibapi.contract import Contract
# Create contracts - stocks
tsla_contract = Contract()
tsla_contract.symbol = "TSLA"
tsla_contract.secType = "STK"
tsla_contract.exchange = "ISLAND"
tsla_contract.currency = "USD"
# Create contracts - fx pairs
eurgbp_contract = Contract()
eurgbp_contract.symbol = "EUR"
eurgbp_contract.secType = "CASH"
eurgbp_contract.currency = "GBP"
eurgbp_contract.exchange = "IDEALPRO"
Request Market Data¶
Request Streaming Market Data¶
Call app.reqMktData() to request streaming market data.
class App(EWrapper, EClient):
    # Receive market data
    def tickPrice(self, tickerId, field, price, attribs):
        print("Tick Price. Ticker Id:", tickerId, ", TickType: ", TickTypeEnum.to_str(field),
              ", Price: ", price, ", CanAutoExecute: ", attribs.canAutoExecute,
              ", PastLimit: ", attribs.pastLimit, ", PreOpen: ", attribs.preOpen)

# Request market data
# app.reqMktData(tickerId, contract, genericTickList, snapshot, regulatorySnapshot, mktDataOptions)
app.reqMktData(1, tsla_contract, '', False, False, None)
Call app.reqMarketDataType(3) to switch the market data type to delayed data.
# Switch market data type
# 3 for delayed data
app.reqMarketDataType(3)
Request Historical Market Data¶
Call app.reqHistoricalData() to request historical bar data.
class App(EWrapper, EClient):
    # Receive historical bar data
    def historicalData(self, reqId, bar):
        print("HistoricalData. ReqId:", reqId, "BarData.", bar)

# Request historical bar data
# app.reqHistoricalData(tickerId, contract, endDateTime, durationString, barSizeSetting, whatToShow, useRTH, formatDate, keepUpToDate, chartOptions)
app.reqHistoricalData(1, eurgbp_contract, '', '1 M', '1 day', 'ASK', 1, 1, False, None)
Manage Orders¶
class App(EWrapper, EClient):
    def nextValidId(self, orderId: int):
        super().nextValidId(orderId)
        self.nextorderId = orderId
        print('The next valid order id is: ', self.nextorderId)

    def orderStatus(self, orderId, status, filled, remaining, avgFillPrice, permId, parentId,
                    lastFillPrice, clientId, whyHeld, mktCapPrice):
        print("OrderStatus. Id: ", orderId, ", Status: ", status, ", Filled: ", filled,
              ", Remaining: ", remaining, ", AvgFillPrice: ", avgFillPrice,
              ", PermId: ", permId, ", ParentId: ", parentId, ", LastFillPrice: ", lastFillPrice,
              ", ClientId: ", clientId, ", WhyHeld: ", whyHeld, ", MktCapPrice: ", mktCapPrice)

    def openOrder(self, orderId, contract, order, orderState):
        print("OpenOrder. PermID: ", order.permId, ", ClientId: ", order.clientId,
              ", OrderId: ", orderId, ", Account: ", order.account, ", Symbol: ", contract.symbol,
              ", SecType: ", contract.secType, ", Exchange: ", contract.exchange,
              ", Action: ", order.action, ", OrderType: ", order.orderType,
              ", TotalQty: ", order.totalQuantity, ", CashQty: ", order.cashQty,
              ", LmtPrice: ", order.lmtPrice, ", AuxPrice: ", order.auxPrice,
              ", Status: ", orderState.status)

    def execDetails(self, reqId, contract, execution):
        print("ExecDetails. ", reqId, " - ", contract.symbol, ", ", contract.secType,
              ", ", contract.currency, " - ", execution.execId, ", ", execution.orderId,
              ", ", execution.shares, ", ", execution.lastLiquidity)
Place Orders¶
Call app.placeOrder() to submit an order.
# Place order
# app.placeOrder(orderId, contract, order)
app.placeOrder(app.nextorderId, eurgbp_contract, order)
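The order object passed to app.placeOrder() has to be constructed first; a basic limit order using the standard ibapi Order class might look like this (the quantity and price are illustrative):
from ibapi.order import Order

order = Order()
order.action = 'BUY'
order.orderType = 'LMT'
order.totalQuantity = 100
order.lmtPrice = 0.81  # illustrative limit price for the EUR.GBP pair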
Modify Orders¶
Call app.placeOrder() again with the id of the order to be modified and the updated parameters.
# Modify order
order_id = 1
order.lmtPrice = 0.82
app.placeOrder(order_id, eurgbp_contract, order)
Cancel Orders¶
Call app.cancelOrder() to cancel an order by its id, or app.reqGlobalCancel() to cancel all open orders.
# Cancel order by order Id
app.cancelOrder(app.nextorderId)
# Cancel all open orders
app.reqGlobalCancel()
Request Account Summary¶
Call app.reqAccountSummary() to get the summarised account information.
class App(EWrapper, EClient):
    # Receive account summary
    def accountSummary(self, reqId: int, account: str, tag: str, value: str, currency: str):
        print("Acct Summary. ReqId:", reqId, "Acct:", account, "Tag: ", tag, "Value:", value,
              "Currency:", currency)

# Request account summary in base currency
app.reqAccountSummary(9002, "All", "$LEDGER")
# Request account summary in HKD
app.reqAccountSummary(9002, "All", "$LEDGER:HKD")
References
Attention
Resources¶
Acknowledgement¶
About the author¶
- A day in the life of a female in tech at Morgan Stanley
- Google Women Techmakers
- And more to be updated…
Indices and tables¶
Attention