Latest entries from the Blog

StatArb in Forex

In this article I will talk about a strategy that has been circulating on the internet for many years: statistical arbitrage based on the cointegration of two Forex pairs.

This code is based on the one you will find here, and also on Jacques Joubert’s LinkedIn and GitHub. I take this opportunity to mention that I found this strategy thanks to Quantocracy, a website that links to a handful of quant sites worth bookmarking and visiting every day. It was also on Quantocracy that I found the Tulip Quant page that inspired the ARIMA + MonteCarlo article (for now, only in the Spanish version of Investingdev).

If you want to try Jacques Joubert’s code, follow the instructions in his article. If you use R or RStudio under Linux, you will need to rename some variables, because Linux is case sensitive. Other than that, the code works perfectly.

If you want to test the code that I have modified so that it can be used in Forex and with timeframes below D1, you will need, as in the previous article, an Oanda API key, and you will have to modify the corresponding variables in the code. An important note at this point: Oanda has recently launched new accounts called V20, and apparently they differ in some way from the one I have (the older type). I cannot guarantee that the code I publish here works with V20 accounts. If anyone tests the code with a V20 account, I will be grateful to hear your results and impressions. If I see enough interest, I will also prepare a fix for Oanda’s V20 accounts.

To run a backtest, it is enough to write:

> eurusd.usdchf <- prepareData("EUR_USD","USD_CHF","M15","2015-01-01","2016-09-21")

eurusd.usdchf will store the historical M15 candles of EURUSD and USDCHF, from January 1st, 2015 until September 21st, 2016, with three columns in total: date and time, closing price of pair 1, and closing price of pair 2.

Then we run the backtest (BT) on the in-sample period, up to January 1st, 2016:

> BT.eurusd.usdchf <- BacktestPair(eurusd.usdchf,mean=200,slippage=0.00050,criticalValue=-3,startDate="2015-01-01",endDate="2016-01-01")

That runs a BT of EURUSD-USDCHF, with entries given by Z-score crosses using a 200-period average, a slippage of 5 pips (which we could treat as commission costs), a criticalValue of -3 to make the ADF cointegration test stricter, and the start and end dates. This BT uses only part of the downloaded history, because the rest of the period is reserved for the out-of-sample BT.

This is the result:


To run the out-of-sample (OOS) backtest:

> BT.eu.oos <- xts(BT.eurusd.usdchf[,18]*10,BT.eurusd.usdchf$Date)
> GenerateReport.xts(BT.eu.oos,startDate="2016-01-01",endDate="2016-09-21")

And this is another BT, for EURUSD-GBPUSD with M1 candles. Note that the following graph is in-sample only:


This is a promising strategy, isn’t it?

If you want the R code of this strategy, please share!

PS. And, of course, thanks to Jacques Joubert for sharing this strategy, and for answering my questions! It was very kind of you!

Intro to Programming 2

Today we will talk about how the computer handles information. I said in the previous post that what the computer sees are numbers.

In computing, the quantum of information, that is, the minimum amount of information, is one bit. A bit may be a ‘0’ or a ‘1’. How can we represent reality, and many other things, on the computer with only ones and zeroes?

Well, let’s start from the beginning.

Think of where information is stored on a computer: the hard disk, or RAM. How is information stored within these devices? Take a traditional hard disk, which is a magnetic device (RAM holds its bits as electrical charge instead, but the idea is the same). Without dwelling on it too much: do you remember that magnets have a north and a south pole? A storage device has a great many “cells” that can be magnetized one way or the other, so that each cell stores one bit according to its magnetic state.

But it would be difficult to manage information with bits alone. Just as a kilogram is 1000 grams, we do the same with bits: we use larger units that group bits into blocks to make them easier to handle.

Thus, 8 bits make one byte.

1024 bytes make a kilobyte (kB), also called in computer jargon simply “k”.

1024 kB make one megabyte (MB), also called “mega”

1024 MB make one gigabyte (GB) or “giga”

1024 GB make one terabyte (TB) or “tera”

1024 TB make a petabyte (PB) or “peta”

And why are these groupings so odd? Why is a kB 1024 bytes and not just 1000? We should make life easier, right?

Well, this is because the quantum of information is a bit. A one-bit variable can have two possible values. How many possible values do we have with 2 bits? 4, right? And with 3 bits? 8. And with 4 bits? 16.
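The doubling can be checked with a few lines of Python, used here only for illustration (the function names are made up for this sketch): each extra bit doubles the number of combinations, so n bits give 2 to the power n values.

```python
from itertools import product

def possible_values(n_bits):
    """Return how many distinct values fit in n_bits bits (2 to the power n_bits)."""
    return 2 ** n_bits

def all_patterns(n_bits):
    """List every bit pattern of the given length, e.g. '00', '01', '10', '11'."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

for n in (1, 2, 3, 4):
    print(n, "bits ->", possible_values(n), "values")

# Writing the patterns out shows where the count comes from.
print(all_patterns(2))
```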

What happens is that computers work in base 2, instead of base 10, which is what we humans use.

And so these groups of bits are powers of 2 instead of powers of 10:

1 byte is 2^3 bits
1 kB is 2^13 bits (or 2^10 bytes, which is easier to remember)
1 MB is 2^20 bytes (or 2^10 kB)
1 GB is 2^30 bytes (or 2^10 MB)

Knowing this, you may now better understand why, when you buy a “2 Teras” hard disk, the size you then see in your operating system is a different one, and always smaller: what you have bought is 2000 “gigas”, whereas 2 “teras” are actually 2048 GB.
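As a rough sketch of that arithmetic in Python (the names are invented for the example, and the units are the binary ones used in this article, where each step is a factor of 1024):

```python
# Binary units as used in this article: each one is 1024 times the previous.
BYTE = 1
KB = 1024 * BYTE      # 2 ** 10 bytes
MB = 1024 * KB        # 2 ** 20 bytes
GB = 1024 * MB        # 2 ** 30 bytes
TB = 1024 * GB        # 2 ** 40 bytes

BITS_PER_BYTE = 8     # 2 ** 3

print(KB * BITS_PER_BYTE == 2 ** 13)   # bits in one kB
print(2 * TB // GB)                    # GB in 2 "teras" counted in base 2
sold_as = 2000 * GB                    # the "2000 gigas" from the paragraph above
print(sold_as / TB)                    # how many binary "teras" that really is
```

The last figure comes out a little under 2, which is the gap the paragraph describes.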

Let me give you an example. Say I have an 8 TB drive in my computer. When I check the size of this drive in an operating system like Linux (# df), I see this:

Filesystem 1K-blocks Used Available Use% Mounted on
pppppppp / 7999999992 7999999992 0 0% /mnt/backup

Think about what that means: the 2nd column is in “1K blocks”. In other words, I have 8 “teras”, that is, 8,000 “gigas” or 8 million “megas”. Only 8 “kas” are missing, gone “to limbo”. Forget those missing 8 kB.

However, if I add the -h option to that command (# df -h), for human-readable output, this is what I see:

Filesystem Size Used Avail Use% Mounted on
PPPPPPP / 7,5T 7,5T 0 0% /mnt/backup

That is, about 7 and a half “teras” of usable disk space.
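The jump from the first listing to the second is just a change of unit. A small Python sketch reproduces it, assuming the human-readable output counts in powers of 1024 (the variable names are made up for the example):

```python
blocks_1k = 7_999_999_992            # the "1K-blocks" figure from the first listing
BLOCKS_PER_TERA = 1024 ** 3          # 1024 * 1024 * 1024 one-kilobyte blocks per binary "tera"

size_tera = blocks_1k / BLOCKS_PER_TERA
print(round(size_tera, 2))           # a little under 7.5
```

The result is slightly below the 7,5T shown in the listing, which suggests the tool rounds the figure for display.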

You must also be careful when you see advertisements for internet lines: 100 “megas” of bandwidth. The internet speeds we read about are almost always expressed in bits, not bytes. If we had to transmit a file of 1 kB (1024 bytes) through a line with a bandwidth of 1 kbps (1 kilobit per second), it would take not 1 second but more than 8, and that is in ideal conditions. When we hire a “12 Mb” line (Mb is often used instead of MB, or Gb instead of GB, to denote this difference), that means “12 megabits per second” and not “megabytes”. It is a difference worth taking into account.
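A minimal Python sketch of that calculation, assuming that 1 kbps means 1000 bits per second and ignoring any protocol overhead (the function name is invented for the example):

```python
def transfer_seconds(file_bytes, line_bits_per_second):
    """Ideal transfer time: convert the file size to bits, then divide by the line speed."""
    file_bits = file_bytes * 8
    return file_bits / line_bits_per_second

# The example from the text: a 1 kB file (1024 bytes) over a 1 kbps line.
print(transfer_seconds(1024, 1000))

# A "12 Mb" line moves 12 megabits per second; dividing by 8 gives bytes per second.
print(12_000_000 / 8)
```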

The computer also uses the hexadecimal numbering system for encoding memory addresses and for many other tasks. The hexadecimal system, unlike binary, which is base 2, or decimal, which is base 10, uses base 16. The minimum amount of information in the binary system is the bit, which can take 2 values. In the decimal system this minimum amount of information is a digit, which can take 10 possible values (zero to nine). In the hexadecimal system the minimum amount of information is a special digit that can take 16 possible values: 0 (zero), 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E and F.
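To see the three bases side by side, here is a short Python sketch using only built-in functions. Python marks binary numbers with a 0b prefix and hexadecimal ones with 0x; the value 255 is chosen just for the example.

```python
value = 255

print(bin(value))    # the number written in base 2
print(value)         # the number written in base 10
print(hex(value))    # the number written in base 16

# Going the other way: int() takes the text and the base it is written in.
print(int("ff", 16))
print(int("11111111", 2))

# The sixteen hexadecimal digits, from 0 to F.
digits = [format(d, "X") for d in range(16)]
print(digits)
```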

I will not go much further into how to work with hexadecimal numbers or how to convert from decimal to hexadecimal and vice versa, but if you want to understand it you can search a little on Google.

Going back to the memory addresses of a computer: since the computer internally uses hexadecimal encoding for these addresses, a memory address such as 570AC7F would be perfectly valid. In computer science, hexadecimal numbers are often written with a “0x” (zero, x) prefix to make it clear that the number is hexadecimal. For example: 0x570AC7F.

I leave converting that number to decimal as an optional exercise…

Intro to Programming

If we want to learn to program, and especially to program investment strategies, we must start from the beginning. That is, we will begin by explaining what a program is, what a programming language is, what types of languages exist…

So today, to start, we will explain some concepts.

What is a program?

A program is nothing more than a set of instructions that tells the computer what to do. But this definition is very vague. Very. What is an instruction? What should it look like? And why tell the computer what to do? What do we mean by that?

Well, let’s look at an example. We want to write a program that checks the calendar to see whether we have any pending task for today and, if so, sends us an email. These would be the instructions of the program:

 IF the current time is 07:00 am THEN
   Today <- date today
   Task <- Calendar.today.task
   IF Task is not NULL THEN
     send email Task


I have written each instruction on its own line, for clarity.

The instructions should be atomic (do only one thing) and precise (not ambiguous). It is also desirable for them to be clear (instructions can be confusing if they are too long, for example), but this is a more subjective criterion.

These instructions are written in an invented language called pseudocode. It is a language halfway between our own language and a programming language (C, C++, MQL4, or whichever one we end up using). It comes in handy while we are learning to program: writing our programs in pseudocode helps us structure both our thinking and our program. Programming is more about knowing programming structures than about language syntax alone. That is, once you know how to program in one language, know the structures well, and know how to build a program, you will learn a new programming language very easily. The really difficult part is learning the structures and the most basic concepts.
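To show how close pseudocode is to a real language, here is one possible translation of the program above into Python. It is only a sketch: the calendar is a made-up dictionary, and `send_email` is a stand-in that builds the message instead of really sending it.

```python
from datetime import date, datetime, time

# Hypothetical calendar: maps a date to the task pending for that day.
my_calendar = {date(2016, 8, 11): "ask for a doctor's appointment"}

def send_email(task):
    """Stand-in for a real mail function: it only builds and returns the message."""
    return "Reminder: " + task

def check_calendar(now):
    """Follow the pseudocode step by step for the given moment."""
    if now.time() == time(7, 0):          # IF the current time is 07:00 am
        today = now.date()                # Today <- date today
        task = my_calendar.get(today)     # Task <- Calendar.today.task
        if task is not None:              # IF Task is not NULL
            return send_email(task)       # send email Task
    return None

print(check_calendar(datetime(2016, 8, 11, 7, 0)))
```

Each line of the pseudocode maps onto one line of Python, which is the point: the structure is the same, and only the syntax changes.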

The second instruction (the second line) says: store today’s date in the variable Today. The third line stores today’s task from the calendar in the variable Task.

The syntax we just used may seem confusing if you have never programmed. Why do I have to keep today’s date in a variable? Doesn’t the computer know what day it is? What is a variable?

Well, no. The computer does not know anything that we do not tell it. The computer can read answers and ask questions, read data and write data, but know, really know… no. We ourselves, if we want to know what day it is today, look at the calendar. In other words, our brain sets aside a little space to store “today is August 11, 2016”. That memory space is just a space our brain will reuse at another time, perhaps the next day when we memorize the new date. And when we need to remember what day it is today, the brain accesses that space in our memory to retrieve the data it memorized earlier.

This is analogous to what a variable is: a memory space inside the computer, a data store, whose value can vary throughout the execution of a program. These variables have a name, to make them easier for us to use. In our example, the names of the variables are Today and Task. I say the name is there to make things easier for us because, remember, a variable is only a memory space inside the computer. If only the computer needed to access these memory cells, we could use a numeric code as the address of the variable. For example, a very large number: memory locations 1 million, 1 million and 1, 1 million and 2, and so on. Or we could label the positions of computer memory with a code of numbers and letters, like car license plates: 8989BB, 7777FF…

The truth is that this is how the computer works internally. Inside the computer, there are only numbers. Memory addresses are numbers to the computer, and the values of the variables are numbers too. Even if the variable is Task and its value is “ask the doctor for an appointment next week”, what the computer sees is only numbers.
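A quick way to see this for yourself, sketched in Python (the text is encoded as UTF-8 here; other encodings would give different numbers for some characters):

```python
task = "ask the doctor"

# Each character is stored as a number (here using the UTF-8 encoding).
numbers = list(task.encode("utf-8"))
print(numbers)

# The first three numbers written in base 2, closer to what is physically stored.
print([format(n, "08b") for n in numbers[:3]])

# A variable can change its value while the program runs.
task = "buy bread"
print(list(task.encode("utf-8")))
```

The name `task` stays the same throughout; only the numbers stored behind it change.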

Until the next Introduction to Programming article arrives, you might want to think of a program for how your day goes (1. The alarm sounds. 2. I get up. 3. I shower…).

Try writing three or four programs in pseudocode about your daily tasks, or whatever you want, to train yourself in this way of thinking. You will see that with practice it becomes easier each time, and it will also help when you face a complex problem: it will not look so complex, because you will split it into several simple tasks. Mind you, it is easy 😉


Latest products from the Store

EA Set And Forget

An automated strategy (EA) developed in our MQL4 programming course (Spanish version; the English version is coming soon). This version adds some improvements: a maximum spread control for trades, a minimum candle size below which the candle is ignored and no trade is opened, and a daily profit or loss percentage at which we close trading for the day. As we saw in the programming course, we optimized over an in-sample period of about two months, chose a promising set, and then ran a backtest. This is a backtest of the strategy since 2010, with a fixed lot size and the set obtained from the two-month optimization. That is, an immense out-of-sample period for such a small in-sample one.