Detecting Exoplanets in Satellite Images with Cutting-Edge Algorithms and Big Data Analysis


The TESS and Kepler space telescopes aimed to capture as much of the night sky in their observations as possible. The TESS telescope’s coverage is shown in the diagram below (MIT, 2015).

Most points in the sky have weeks’ worth of images, captured to track changes in stars over time. Using this data, it is possible to find planets orbiting other stars with the transit method.

The transit method detects the dip in a star's brightness as a planet passes in front of it. The technique is reliable because planets have fixed orbital periods: if a star dims every 14.5 days, there is most probably a planet orbiting it with an orbital period of 14.5 days.
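To make the idea concrete, here is a minimal sketch of an idealized light curve with a box-shaped dip that repeats every 14.5 days. The function name `box_transit_flux`, the 1% depth, the 0.3-day duration, and the sampling cadence are all illustrative assumptions, not values from the TESS pipeline or from this project's scripts.

```python
def box_transit_flux(times, period, duration, depth, t0=0.0):
    """Relative flux of an idealized star: 1.0 normally, 1.0 - depth in transit."""
    flux = []
    for t in times:
        # Time offset from the nearest transit midpoint, in [-period/2, period/2)
        offset = (t - t0 + 0.5 * period) % period - 0.5 * period
        flux.append(1.0 - depth if abs(offset) <= 0.5 * duration else 1.0)
    return flux

# Illustrative values: 60 days sampled every 0.1 day, a 1% dip lasting 0.3 day
times = [i * 0.1 for i in range(600)]
flux = box_transit_flux(times, period=14.5, duration=0.3, depth=0.01, t0=2.0)

# Start time of each dip: a dimmed sample preceded by an undimmed one
dip_starts = [times[i] for i in range(1, len(times))
              if flux[i] < 1.0 and flux[i - 1] >= 1.0]
gaps = [b - a for a, b in zip(dip_starts, dip_starts[1:])]
print(dip_starts, gaps)
```

The gaps between successive dips all come out at roughly 14.5 days, which is the regularity the transit method relies on. Real light curves are noisy, so the dips are rarely this easy to pick out by eye.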

In this project, I apply some experimental preprocessing techniques and use high-performance computing (HPC) to run them quickly and easily over TESS’s data repository.

Setting Up the Job


  1. We will use the Jobs tab to create an HPC job, which then runs in the cloud with our desired software and hardware stack. To do this, navigate to the Jobs tab and select Create New Job.
  1. Upload the input files. The job consists of three files: the utility file, the target list (csv-file-toi-catalog.csv), and the Python script. The Python script schedules the work over multiple cores, taking advantage of the MPI-enabled Anaconda environment. After this step, you should see the uploaded files listed.
  1. Go to Software Settings and select Miniconda as the software.
  1. Set up Miniconda by writing the commands that run on initialization. These install the necessary packages and run the main file. The conda environment is already activated when these commands run:
conda install -y -c anaconda numpy
conda install -y -c astropy astroquery
conda install -y -c conda-forge astropy
conda install -y -c conda-forge matplotlib
conda install -y -c anaconda scipy
conda install -y mpi4py
  1. Finally, add the following command to run the scripts, replacing #NUM_CORES with the number of cores you intend to allocate (18 in this example). Important: make sure your scripts do not require user input or confirmation. For the conda commands, the -y flag takes care of this.
mpirun -n #NUM_CORES python

  1. On the Hardware Selection page, select the coretype and walltime. This analysis uses the Emerald coretype with 18 cores and 4 hours of walltime. Scroll down to see the other options and judge which coretype has the specifications best suited to your job. Performance benchmarking can also be used to find the optimal coretype for price or speed. Click Submit to start the job.
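The upload step above mentions that the Python script schedules the work over multiple cores using MPI. The project's actual script is not reproduced here, so the following is only a sketch of one common pattern with mpi4py: each rank takes a round-robin slice of the target list, and rank 0 gathers the results. The helper `targets_for_rank`, the placeholder target IDs, and the stand-in "analysis" are hypothetical.

```python
def targets_for_rank(targets, rank, size):
    """Round-robin split: rank r takes items r, r + size, r + 2 * size, ..."""
    return targets[rank::size]

try:
    from mpi4py import MPI  # installed by the conda commands in the setup steps
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
except ImportError:
    # Fallback so this sketch also runs as a single process without MPI
    comm, rank, size = None, 0, 1

# Hypothetical placeholder IDs; the real list would come from the target CSV
all_targets = [f"TOI-{n}" for n in range(1, 11)]
my_targets = targets_for_rank(all_targets, rank, size)

# Stand-in for the real per-target analysis
results = [(name, len(name)) for name in my_targets]

# Collect every rank's partial results on rank 0
gathered = comm.gather(results, root=0) if comm is not None else [results]

if rank == 0:
    flat = [row for part in gathered for row in part]
    print(f"{len(flat)} targets processed by {size} process(es)")
```

Launched with mpirun and 18 processes, each process would handle roughly one eighteenth of the targets. A round-robin split is a simple choice that works well when targets take similar time to process; uneven workloads may call for a dynamic scheme instead.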


After the script has run, the results are stored in the file out.csv. Navigate to the Results tab and search for out.csv as shown below. The file contains two values for each TESS Object of Interest: the orbital period of the possible exoplanet and the power of the periodogram. The power can be interpreted as a rough measure of confidence that a repeating pattern exists around that star.
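To illustrate how a period and a power can come out of a light curve, here is a deliberately simplified box-style periodogram. It is a stand-in, not the method used by this project's script, which is not shown here. For each trial period it folds the data, bins it by phase, and scores the period by how far the deepest bin sits below the mean flux. The function names, bin count, and synthetic data are assumptions made for the example.

```python
def box_power(times, flux, period, n_bins=50):
    """Power for one trial period: depth of the deepest phase bin below the mean flux."""
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for t, f in zip(times, flux):
        b = min(int((t % period) / period * n_bins), n_bins - 1)
        sums[b] += f
        counts[b] += 1
    bin_means = [s / c for s, c in zip(sums, counts) if c > 0]
    return sum(flux) / len(flux) - min(bin_means)

def best_period(times, flux, trial_periods):
    """Return the (period, power) pair with the highest power."""
    return max(((p, box_power(times, flux, p)) for p in trial_periods),
               key=lambda pair: pair[1])

# Synthetic, noise-free light curve (illustrative values only):
# 30 days at 0.02-day cadence, with a 1% dip lasting 0.2 day every 3.0 days
times = [i * 0.02 for i in range(1500)]
flux = [0.99 if 1.0 <= t % 3.0 < 1.2 else 1.0 for t in times]

trial_periods = [round(1.0 + 0.05 * i, 2) for i in range(81)]  # 1.0 to 5.0 days
period, power = best_period(times, flux, trial_periods)
print(period, power)
```

At the true period all the dimmed samples line up in the same phase bins, so the deepest bin is as low as it can get and the power peaks. At other trial periods the dips are smeared across many bins and the power drops, which is why a high power suggests a genuinely repeating signal.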

We can then fold the light curves at the estimated orbital period and look for the characteristic dip in brightness that signals the existence of a planet. Some promising examples are shown below. These candidates would then be passed on for further review by other methods.
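Folding simply maps every observation time to its position within one orbit, so that dips from different orbits stack on top of each other. A minimal sketch, with a hypothetical helper `fold` and made-up sample points:

```python
def fold(times, flux, period, epoch=0.0):
    """Return (phase, flux) pairs sorted by phase, with phase in [0, 1)."""
    folded = [(((t - epoch) % period) / period, f) for t, f in zip(times, flux)]
    return sorted(folded)

# Illustrative samples: three dimmed points one period (14.5 days) apart,
# with undimmed points halfway between them
times = [0.0, 7.25, 14.5, 21.75, 29.0]
flux = [0.99, 1.0, 0.99, 1.0, 0.99]

for phase, f in fold(times, flux, period=14.5):
    print(f"{phase:.2f}  {f}")
```

All three dimmed samples land at phase 0, while the undimmed ones land at phase 0.5. Folding at a wrong period would scatter the dimmed samples across phases, so a clean, stacked dip is a useful visual check on the period estimate.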


MIT. (2015). TESS. TESS – Transiting Exoplanet Survey Satellite. Retrieved July 15, 2022, from