The easiest way to install Camelot is with conda, the package manager and environment management system for the Anaconda distribution.
$ conda install -c conda-forge camelot-py
Using pip: after installing the dependencies (tk and ghostscript), you can use pip to install Camelot.
$ pip install camelot-py[cv]
This page covers some of the more advanced configurations for Lattice and Stream.
Process background lines¶
To detect line segments, Lattice needs the lines that make the table to be in the foreground. Here’s an example of a table with lines in the background:
Source: PDF
To process background lines, you can pass process_background=True.
Tip
Here’s how you can do the same with the command-line interface.
State | Date | Halt stations | Halt days | Persons directly reached(in lakh) | Persons trained | Persons counseled | Persons testedfor HIV |
Delhi | 1.12.2009 | 8 | 17 | 1.29 | 3,665 | 2,409 | 1,000 |
Rajasthan | 2.12.2009 to 19.12.2009 | ||||||
Gujarat | 20.12.2009 to 3.1.2010 | 6 | 13 | 6.03 | 3,810 | 2,317 | 1,453 |
Maharashtra | 4.01.2010 to 1.2.2010 | 13 | 26 | 1.27 | 5,680 | 9,027 | 4,153 |
Karnataka | 2.2.2010 to 22.2.2010 | 11 | 19 | 1.80 | 5,741 | 3,658 | 3,183 |
Kerala | 23.2.2010 to 11.3.2010 | 9 | 17 | 1.42 | 3,559 | 2,173 | 855 |
Total | 47 | 92 | 11.81 | 22,455 | 19,584 | 10,644 |
Visual debugging¶
Note
Visual debugging using plot() requires matplotlib, which is an optional dependency. You can install it using $ pip install camelot-py[plot].
You can use the plot() method to generate a matplotlib plot of various elements that were detected on the PDF page while processing it. This can help you select table areas, column separators and debug bad table outputs, by tweaking different configuration parameters.
You can specify the type of element you want to plot using the kind keyword argument. The generated plot can be saved to a file by passing a filename keyword argument. The following plot types are supported:
‘text’
‘grid’
‘contour’
‘line’
‘joint’
‘textedge’
Note
‘line’ and ‘joint’ can only be used with Lattice and ‘textedge’ can only be used with Stream.
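The flavor restrictions in the note above can be captured in a small lookup. This is a plain-Python sketch that does not call Camelot; the helper name allowed_kinds is made up for illustration, and the kind names are simply the ones listed above.

```python
# Plot kinds listed above, and the flavor each restricted kind needs.
PLOT_KINDS = ("text", "grid", "contour", "line", "joint", "textedge")
FLAVOR_ONLY = {"line": "lattice", "joint": "lattice", "textedge": "stream"}


def allowed_kinds(flavor):
    """Return the plot kinds usable with the given flavor (hypothetical helper)."""
    flavor = flavor.lower()
    if flavor not in ("lattice", "stream"):
        raise ValueError(f"unknown flavor: {flavor!r}")
    # Kinds without a restriction are allowed for every flavor.
    return [k for k in PLOT_KINDS if FLAVOR_ONLY.get(k, flavor) == flavor]


print(allowed_kinds("lattice"))
print(allowed_kinds("stream"))
```

Checking the kind against the flavor up front gives a clearer error than discovering the mismatch halfway through a debugging session.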
Let’s generate a plot for each type using this PDF as an example. First, let’s get all the tables out.
text¶
Let’s plot all the text present on the table’s PDF page.
Tip
Here’s how you can do the same with the command-line interface.
This, as we shall later see, is very helpful with Stream for noting table areas and column separators, in case Stream does not guess them correctly.
Note
The x-y coordinates shown above change as you move your mouse cursor on the image, which can help you note coordinates.
table¶
Let’s plot the table (to see if it was detected correctly or not). This plot type, along with contour, line and joint is useful for debugging and improving the extraction output, in case the table wasn’t detected correctly. (More on that later.)
Tip
Here’s how you can do the same with the command-line interface.
The table is perfect!
contour¶
Now, let’s plot all table boundaries present on the table’s PDF page.
Tip
Here’s how you can do the same with the command-line interface.
line¶
Cool, let’s plot all line segments present on the table’s PDF page.
Tip
Here’s how you can do the same with the command-line interface.
joint¶
Finally, let’s plot all line intersections present on the table’s PDF page.
Tip
Here’s how you can do the same with the command-line interface.
textedge¶
You can also visualize the textedges found on a page by specifying kind='textedge'. To know more about what a “textedge” is, you can see pages 20, 35 and 40 of Anssi Nurminen’s master’s thesis.
Tip
Here’s how you can do the same with the command-line interface.
Specify table areas¶
In cases such as these, it can be useful to specify exact table boundaries. You can plot the text on this page and note the top left and bottom right coordinates of the table.
Table areas that you want Camelot to analyze can be passed as a list of comma-separated strings to read_pdf(), using the table_areas keyword argument.
Tip
Here’s how you can do the same with the command-line interface.
One Withholding | |
Payroll Period | Allowance |
Weekly | $71.15 |
Biweekly | 142.31 |
Semimonthly | 154.17 |
Monthly | 308.33 |
Quarterly | 925.00 |
Semiannually | 1,850.00 |
Annually | 3,700.00 |
Daily or Miscellaneous | 14.23 |
(each day of the payroll period) |
Note
table_areas accepts strings of the form x1,y1,x2,y2 where (x1, y1) -> top-left and (x2, y2) -> bottom-right in PDF coordinate space. In PDF coordinate space, the bottom-left corner of the page is the origin, with coordinates (0, 0).
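Because the origin sits at the bottom-left, the top edge of a table has the larger y value, which is easy to get backwards. The sketch below formats and sanity-checks one area string; area_string is a hypothetical helper (not part of Camelot) and the coordinates are made up.

```python
def area_string(x1, y1, x2, y2):
    """Format a table area as 'x1,y1,x2,y2' for table_areas.

    (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner in
    PDF coordinate space, where the origin is the bottom-left of the page.
    Hypothetical helper, not part of Camelot.
    """
    if not x1 < x2:
        raise ValueError("x1 (left) must be smaller than x2 (right)")
    if not y1 > y2:
        # With a bottom-left origin, the top edge has the larger y value.
        raise ValueError("y1 (top) must be larger than y2 (bottom)")
    return ",".join(str(v) for v in (x1, y1, x2, y2))


# Made-up coordinates, for illustration only.
table_areas = [area_string(100, 700, 500, 400)]
print(table_areas)
```

The resulting list is in the shape that the table_areas keyword argument to read_pdf() expects.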
Specify table regions¶
However there may be cases like [1] and [2], where the table might not lie at the exact coordinates every time but in an approximate region.
You can use the table_regions keyword argument to read_pdf() to solve for such cases. When table_regions is specified, Camelot will only analyze the specified regions to look for tables.
Tip
Here’s how you can do the same with the command-line interface.
Età dell’Assicuratoall’epoca del decesso | Misura % dimaggiorazione |
18-75 | 1,00% |
76-80 | 0,50% |
81 in poi | 0,10% |
Specify column separators¶
In cases like these, where the text is very close to each other, it is possible that Camelot may guess the column separators’ coordinates incorrectly. To correct this, you can explicitly specify the x coordinate for each column separator by plotting the text on the page.
You can pass the column separators as a list of comma-separated strings to read_pdf(), using the columns keyword argument.
In case you passed a single column separators string list, and no table area is specified, the separators will be applied to the whole page. When a list of table areas is specified and you need to specify column separators as well, the length of both lists should be equal. Each table area will be mapped to each column separators’ string using their indices.
For example, if you have specified two table areas, table_areas=['12,54,43,23','20,67,55,33'], and only want to specify column separators for the first table, you can pass an empty string for the second table in the column separators’ list like this: columns=['10,120,200,400',''].
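Keeping the two lists the same length by hand gets error-prone with more than a couple of tables. A minimal sketch of one way to pad the separators list, assuming you track separators by table index; align_columns is a hypothetical helper, not a Camelot function.

```python
def align_columns(table_areas, separators_by_index):
    """Build a columns list with one entry per table area.

    separators_by_index maps a table area's index to its separator string;
    areas without an entry get an empty string. Hypothetical helper.
    """
    for index in separators_by_index:
        if not 0 <= index < len(table_areas):
            raise IndexError(f"no table area at index {index}")
    return [separators_by_index.get(i, "") for i in range(len(table_areas))]


table_areas = ["12,54,43,23", "20,67,55,33"]
columns = align_columns(table_areas, {0: "10,120,200,400"})
print(columns)
```

The two lists can then be passed together as the table_areas and columns keyword arguments.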
Let’s get back to the x coordinates we got from plotting the text that exists on this PDF, and get the table out!
Tip
Here’s how you can do the same with the command-line interface.
… | … | … | … | … | … | … | … | … | … |
LICENSE | PREMISE | ||||||||
NUMBER TYPE DBA NAME | LICENSEE NAME | ADDRESS | CITY | ST | ZIP | PHONE NUMBER | EXPIRES | ||
… | … | … | … | … | … | … | … | … | … |
Ah! Since PDFMiner merged the strings, “NUMBER”, “TYPE” and “DBA NAME”, all of them were assigned to the same cell. Let’s see how we can fix this in the next section.
Split text along separators¶
To deal with cases like the output from the previous section, you can pass split_text=True to read_pdf(), which will split any strings that lie in different cells but have been assigned to a single cell (as a result of being merged together by PDFMiner).
Tip
Here’s how you can do the same with the command-line interface.
… | … | … | … | … | … | … | … | … | … |
LICENSE | PREMISE | ||||||||
NUMBER | TYPE | DBA NAME | LICENSEE NAME | ADDRESS | CITY | ST | ZIP | PHONE NUMBER | EXPIRES |
… | … | … | … | … | … | … | … | … | … |
Flag superscripts and subscripts¶
There might be cases where you want to differentiate between the text and superscripts or subscripts, like this PDF.
In this case, the text that other tools return will be 24.912. This is relatively harmless when the decimal point is involved. But when it isn’t there, you’ll be left wondering why the results of your data analysis are 10x bigger!
You can solve this by passing flag_size=True, which will enclose the superscripts and subscripts with <s></s>, based on font size, as shown below.
Tip
Here’s how you can do the same with the command-line interface.
… | … | … | … | … | … | … | … | … | … | … |
Karnataka | 22.44 | 19.59 | 2.86 | 1.22 | 0.89 | 0.69 | ||||
Kerala | 29.03 | 24.91<s>2</s> | 4.11 | 1.77 | 0.48 | 1.45 | ||||
Madhya Pradesh | 27.13 | 23.57 | 3.56 | 0.38 | 1.86 | 1.28 | ||||
… | … | … | … | … | … | … | … | … | … | … |
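Once the superscripts are wrapped in <s></s>, a clean-up script can separate them from the main value. The sketch below is plain Python operating on a cell string like the one in the Kerala row above; split_flagged is a hypothetical helper name.

```python
import re

# flag_size=True wraps flagged text in <s></s>, e.g. "24.91<s>2</s>".
FLAGGED = re.compile(r"<s>(.*?)</s>")


def split_flagged(cell):
    """Return (text without flagged parts, list of flagged parts)."""
    flagged = FLAGGED.findall(cell)
    base = FLAGGED.sub("", cell)
    return base, flagged


print(split_flagged("24.91<s>2</s>"))
print(split_flagged("23.57"))
```

Keeping the flagged parts in a separate list lets you decide later whether they are footnote markers to discard or exponents to keep.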
Strip characters from text¶
You can strip unwanted characters like spaces, dots and newlines from a string using the strip_text keyword argument. Take a look at this PDF as an example, the text at the start of each row contains a lot of unwanted spaces, dots and newlines.
Tip
Here’s how you can do the same with the command-line interface.
… | … | … | … | … | … | … | … | … | … |
Forcible rape | 17.5 | 2.6 | 14.9 | 17.2 | 2.5 | 14.7 | – | – | – |
Robbery | 102.1 | 25.5 | 76.6 | 90.0 | 22.9 | 67.1 | 12.1 | 2.5 | 9.5 |
Aggravated assault | 338.4 | 40.1 | 298.3 | 264.0 | 30.2 | 233.8 | 74.4 | 9.9 | 64.5 |
Property crime | 1,396 .4 | 338 .7 | 1,057 .7 | 875 .9 | 210 .8 | 665 .1 | 608 .2 | 127 .9 | 392 .6 |
Burglary | 240.9 | 60.3 | 180.6 | 205.0 | 53.4 | 151.7 | 35.9 | 6.9 | 29.0 |
… | … | … | … | … | … | … | … | … | … |
Improve guessed table areas¶
While using Stream, automatic table detection can fail for PDFs like this one. That’s because the text is relatively far apart vertically, which can lead to shorter textedges being calculated.
Note
To know more about how textedges are calculated to guess table areas, you can see pages 20, 35 and 40 of Anssi Nurminen’s master’s thesis.
Let’s see the table area that is detected by default.
Tip
Here’s how you can do the same with the command-line interface.
To improve the detected area, you can increase the edge_tol (default: 50) value to counter the effect of text being placed relatively far apart vertically. Larger edge_tol will lead to longer textedges being detected, leading to an improved guess of the table area. Let’s use a value of 500.
Tip
Here’s how you can do the same with the command-line interface.
As you can see, the guessed table area has improved!
Improve guessed table rows¶
You can pass row_tol=<+int> to group the rows closer together, as shown below.
Clave | Clave | Clave | |||
Nombre Entidad | Nombre Municipio | Nombre Localidad | |||
Entidad | Municipio | Localidad | |||
01 | Aguascalientes | 001 | Aguascalientes | 0094 | Granja Adelita |
01 | Aguascalientes | 001 | Aguascalientes | 0096 | Agua Azul |
01 | Aguascalientes | 001 | Aguascalientes | 0100 | Rancho Alegre |
Tip
Here’s how you can do the same with the command-line interface.
Clave | Nombre Entidad | Clave | Nombre Municipio | Clave | Nombre Localidad |
Entidad | Municipio | Localidad | |||
01 | Aguascalientes | 001 | Aguascalientes | 0094 | Granja Adelita |
01 | Aguascalientes | 001 | Aguascalientes | 0096 | Agua Azul |
01 | Aguascalientes | 001 | Aguascalientes | 0100 | Rancho Alegre |
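To build some intuition for what a row tolerance does, here is a toy model that groups text by vertical position. It is only an illustration under simplifying assumptions (one y-coordinate per text item, made-up values) and is not Camelot's actual implementation; group_rows is a hypothetical name.

```python
def group_rows(y_positions, row_tol=2):
    """Group text y-coordinates into rows (toy model, not Camelot's code).

    A value joins the current row when it lies within row_tol of the first
    value placed in that row; otherwise it starts a new row.
    """
    rows = []
    for y in sorted(y_positions, reverse=True):  # top of the page first
        if rows and abs(rows[-1][0] - y) <= row_tol:
            rows[-1].append(y)
        else:
            rows.append([y])
    return rows


# Made-up y-coordinates: three closely spaced header lines, then two data rows.
ys = [700, 696, 692, 660, 640]
print(group_rows(ys, row_tol=2))
print(group_rows(ys, row_tol=10))
```

With the larger tolerance, the closely spaced header lines fall into a single row while the data rows stay separate.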
Detect short lines¶
There might be cases while using Lattice when smaller lines don’t get detected. The size of the smallest line that gets detected is calculated by dividing the PDF page’s dimensions with a scaling factor called line_scale. By default, its value is 15.
As you can guess, the larger the line_scale, the smaller the size of lines getting detected.
Warning
Making line_scale very large (>150) will lead to text getting detected as lines.
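The inverse relationship between line_scale and the smallest detectable line can be sketched with simple arithmetic. This follows the description above (page dimension divided by line_scale); the page width of 612 is a made-up value, and the units Camelot works in internally may differ.

```python
def smallest_line_length(page_dimension, line_scale=15):
    """Approximate smallest detectable line length along one page dimension."""
    if line_scale <= 0:
        raise ValueError("line_scale must be positive")
    return page_dimension / line_scale


# 612 is a made-up page width used only for illustration.
for scale in (15, 40, 150):
    print(scale, round(smallest_line_length(612, scale), 1))
```

Raising line_scale shrinks the threshold, which is why very large values start picking up strokes of text as lines.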
Here’s a PDF where small lines separating the headers don’t get detected with the default value of 15.
Let’s plot the table for this PDF.
Clearly, the smaller lines separating the headers couldn’t be detected. Let’s try with line_scale=40, and plot the table again.
Tip
Here’s how you can do the same with the command-line interface.
Voila! Camelot can now see those lines. Let’s get our table.
Investigations | No. ofHHs | Age/Sex/Physiological Group | Preva-lence | C.I* | RelativePrecision | Sample sizeper State |
Anthropometry | 2400 | All … | ||||
Clinical Examination | ||||||
History of morbidity | ||||||
Diet survey | 1200 | All … | ||||
Blood Pressure # | 2400 | Men (≥ 18yrs) | 10% | 95% | 20% | 1728 |
Women (≥ 18 yrs) | 1728 | |||||
Fasting blood glucose | 2400 | Men (≥ 18 yrs) | 5% | 95% | 20% | 1825 |
Women (≥ 18 yrs) | 1825 | |||||
Knowledge &Practices on HTN &DM | 2400 | Men (≥ 18 yrs) | 1728 | |||
2400 | Women (≥ 18 yrs) | 1728 |
Shift text in spanning cells¶
By default, the Lattice method shifts text in spanning cells, first to the left and then to the top, as you can observe in the output table above. However, this behavior can be changed using the shift_text keyword argument. Think of it as setting the gravity for a table — it decides the direction in which the text will move and finally come to rest.
shift_text expects a list with one or more characters from the following set: ('','l','r','t','b'), which are then applied in order. The default, as we discussed above, is ['l','t'].
We’ll use the PDF from the previous example. Let’s pass shift_text=[''], which basically means that the text will experience weightlessness! (It will remain in place.)
Investigations | No. ofHHs | Age/Sex/Physiological Group | Preva-lence | C.I* | RelativePrecision | Sample sizeper State |
Anthropometry | ||||||
Clinical Examination | 2400 | All … | ||||
History of morbidity | ||||||
Diet survey | 1200 | All … | ||||
Men (≥ 18yrs) | 1728 | |||||
Blood Pressure # | 2400 | Women (≥ 18 yrs) | 10% | 95% | 20% | 1728 |
Men (≥ 18 yrs) | 1825 | |||||
Fasting blood glucose | 2400 | Women (≥ 18 yrs) | 5% | 95% | 20% | 1825 |
Knowledge &Practices on HTN & | 2400 | Men (≥ 18 yrs) | 1728 | |||
DM | 2400 | Women (≥ 18 yrs) | 1728 |
No surprises there — it did remain in place (observe the strings “2400” and “All the available individuals”). Let’s pass shift_text=['r','b'] to set the gravity to right-bottom and move the text in that direction.
Tip
Here’s how you can do the same with the command-line interface.
Investigations | No. ofHHs | Age/Sex/Physiological Group | Preva-lence | C.I* | RelativePrecision | Sample sizeper State |
Anthropometry | ||||||
Clinical Examination | ||||||
History of morbidity | 2400 | All … | ||||
Diet survey | 1200 | All … | ||||
Men (≥ 18yrs) | 1728 | |||||
Blood Pressure # | 2400 | Women (≥ 18 yrs) | 10% | 95% | 20% | 1728 |
Men (≥ 18 yrs) | 1825 | |||||
Fasting blood glucose | 2400 | Women (≥ 18 yrs) | 5% | 95% | 20% | 1825 |
2400 | Men (≥ 18 yrs) | 1728 | ||||
Knowledge &Practices on HTN &DM | 2400 | Women (≥ 18 yrs) | 1728 |
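The "gravity" analogy can be made concrete with a toy model of a single spanning cell. This is a simplified illustration, not Camelot's implementation: resting_cell is a hypothetical helper, and rows and columns are plain integer indices.

```python
def resting_cell(start, span, shift_text=("l", "t")):
    """Toy model of where text comes to rest inside a spanning cell.

    start: (row, col) where the text originally sits.
    span: (top_row, left_col, bottom_row, right_col) of the spanning cell.
    shift_text: directions applied in order; '' leaves the text in place.
    """
    row, col = start
    top, left, bottom, right = span
    for direction in shift_text:
        if direction == "l":
            col = left
        elif direction == "r":
            col = right
        elif direction == "t":
            row = top
        elif direction == "b":
            row = bottom
        elif direction != "":
            raise ValueError(f"unknown direction: {direction!r}")
    return row, col


# A cell spanning rows 1-3 of column 1, with its text sitting in row 2.
span = (1, 1, 3, 1)
print(resting_cell((2, 1), span))              # default: left, then top
print(resting_cell((2, 1), span, [""]))        # stays in place
print(resting_cell((2, 1), span, ["r", "b"]))  # right, then bottom
```

The three calls mirror the three outputs above, where "2400" lands in the first, middle and last of the three rows it spans.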
Copy text in spanning cells¶
You can copy text in spanning cells when using Lattice, in either the horizontal or vertical direction, or both. This behavior is disabled by default.
copy_text expects a list with one or more characters from the following set: ('v','h'), which are then applied in order.
Let’s try it out on this PDF. First, let’s check out the output table to see if we need to use any other configuration parameters.
Sl. No. | Name of State/UT | Name of District | Disease/ Illness | No. of Cases | No. of Deaths | Date of start of outbreak | Date of reporting | Current Status | … |
1 | Kerala | Kollam | | 19 | 0 | 31/12/13 | 03/01/14 | Under control | … |
2 | Maharashtra | Beed | | 11 | 0 | 03/01/14 | 04/01/14 | Under control | … |
3 | Odisha | Kalahandi | | 42 | 0 | 02/01/14 | 03/01/14 | Under control | … |
4 | West Bengal | West Medinipur | | 145 | 0 | 04/01/14 | 05/01/14 | Under control | … |
Birbhum | | 199 | 0 | 31/12/13 | 31/12/13 | Under control | … | ||
Howrah | | 85 | 0 | 26/12/13 | 27/12/13 | Under surveillance | … |
We don’t need anything else. Now, let’s pass copy_text=['v'] to copy text in the vertical direction. This can save you some time by not having to add this step in your cleaning script!
Tip
Here’s how you can do the same with the command-line interface.
Sl. No. | Name of State/UT | Name of District | Disease/ Illness | No. of Cases | No. of Deaths | Date of start of outbreak | Date of reporting | Current Status | … |
1 | Kerala | Kollam | | 19 | 0 | 31/12/13 | 03/01/14 | Under control | … |
2 | Maharashtra | Beed | | 11 | 0 | 03/01/14 | 04/01/14 | Under control | … |
3 | Odisha | Kalahandi | | 42 | 0 | 02/01/14 | 03/01/14 | Under control | … |
4 | West Bengal | West Medinipur | | 145 | 0 | 04/01/14 | 05/01/14 | Under control | … |
4 | West Bengal | Birbhum | | 199 | 0 | 31/12/13 | 31/12/13 | Under control | … |
4 | West Bengal | Howrah | | 85 | 0 | 26/12/13 | 27/12/13 | Under surveillance | … |
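For comparison, this is roughly the manual clean-up step that copy_text=['v'] saves you from writing. It is a plain-Python sketch with a hypothetical fill_down helper; unlike Camelot, which copies text only within spanning cells, it fills any empty cell in the chosen columns.

```python
def fill_down(rows, columns):
    """Copy the value from the row above into empty cells of the given columns."""
    filled = []
    for row in rows:
        row = list(row)  # work on a copy so the input is left untouched
        if filled:
            for c in columns:
                if row[c] == "":
                    row[c] = filled[-1][c]
        filled.append(row)
    return filled


# First three columns of the West Bengal rows, as they look without copy_text.
rows = [
    ["4", "West Bengal", "West Medinipur"],
    ["", "", "Birbhum"],
    ["", "", "Howrah"],
]
for row in fill_down(rows, columns=[0, 1]):
    print(row)
```

Because it fills every blank it finds, a hand-rolled step like this can overwrite cells that were genuinely empty, which is one reason to prefer doing the copy at extraction time.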
Tweak layout generation¶
Camelot is built on top of PDFMiner’s functionality of grouping characters on a page into words and sentences. In some cases (such as #170 and #215), PDFMiner can group characters that should belong to the same sentence into separate sentences.
To deal with such cases, you can tweak PDFMiner’s LAParams kwargs to improve layout generation, by passing the keyword arguments as a dict using layout_kwargs in read_pdf(). To know more about the parameters you can tweak, you can check out PDFMiner docs.
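As a rough sketch of what passing layout_kwargs looks like: the wrapper name read_with_layout and the file name are placeholders, and detect_vertical is used only as an example of a PDFMiner LAParams option. Camelot is imported inside the function so the snippet can be loaded even where it is not installed.

```python
def read_with_layout(path, **layout_kwargs):
    """Call camelot.read_pdf with PDFMiner LAParams overrides."""
    import camelot  # imported lazily so the sketch loads without Camelot

    return camelot.read_pdf(path, layout_kwargs=layout_kwargs)


# detect_vertical is one of PDFMiner's LAParams options; the value and the
# file name below are placeholders.
layout_kwargs = {"detect_vertical": False}
# tables = read_with_layout("foo.pdf", **layout_kwargs)
```

Which parameters help depends on the PDF, so it is worth changing one at a time and comparing the extracted text.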
First things first…harmonic mixing, mixing in key…they're the same thing.
Just thought I'd clear that one up early because I'm probably going to use those two phrases interchangeably throughout this article.
What Is Harmonic Mixing And Why Should You Care?
Music is written in keys.
Major keys and minor keys mainly. There are others but we're not going to worry about those right now.
Some keys complement each other and sound great together, while other keys clash with each other and sound pretty bad together.
What you need to know as a DJ is that if you want to blend tracks perfectly, it's better to mix in keys that complement each other.
But you shouldn't get too hung-up about mixing in key all the time.
Yes, it's important to not totally key clash, but you don't need to make every mix perfectly harmonic. You only need to do that if you're trying to impress other DJs…most normal people don't care that much.
Finding The Key Of A Track
You can do this three different ways:
Learning how to do this by ear really is a great skill to have and it's actually not that difficult.
You'll need a little bit of kit though…either a piano (should you happen to have one lying around), a synth or a virtual software piano/synth keyboard (what I use).
Very basically, you then set about finding the key on your keyboard that is the best match with the tune that you're playing. I'm not going to turn this post into a tutorial on exactly how to do that because you'll have no trouble finding a better one online than I could write anyway.
Most new DJs probably find the key of their tracks using a bit of software. There are a few different options out there and, increasingly, DJ software packages come with track key identification as a standard feature.
How To Mix In Key
There are three key elements to mixing in key:
- Finding the key of all your tracks (if not already done for you by your DJ software)
- Labelling each of your tracks with their respective key (if not already done for you by your DJ software)
- Knowing which keys blend together the best
We've already covered how to go about finding the key of your tracks. Labelling them accordingly naturally follows on from this.

So now let's look at how to go about finding which musical keys will blend well together.
For this, I recommend using the Camelot Wheel, which is an easy to use, colour-coded system that has the sole purpose of helping you determine which keys are the most compatible.
A picture paints a thousand words, so rather than try to describe it further, here it is:
From the image above, you'll notice that each musical key has been assigned a Camelot key number from one to twelve. And each number has been suffixed with either an A or B.
And this is how to use the Camelot Wheel in two steps:
1. Convert the key of each of your tracks into its Camelot key (e.g., 4B, or 12A, or 5B).
2. Now, to find compatible keys, you just need to know that the three immediately adjacent keys on the wheel are the most compatible. So the keys on either side are compatible, and the key either above or below is also compatible. For example, if the track you are playing is in 5B on the Camelot Wheel, mixing into a track in either 6B, 4B or 5A will give you a harmonic blend.
And you're done mixing in key. How easy was that!
Just keep in mind that when you’re mixing, if you want perfectly harmonic mixes, you have four keys available:
- The same key you're already in
- The key immediately to the left of the key you are already in
- The key immediately to the right of the key you are already in
- The key either above or below the key you are already in
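If you keep your track keys in Camelot notation, the four options above are easy to compute. Here is a small sketch; the function names are my own, and it assumes the numbers wrap around the wheel so that 12 sits next to 1.

```python
def parse_camelot(code):
    """Split a Camelot code such as '5B' into (number, letter)."""
    code = code.strip().upper()
    number, letter = int(code[:-1]), code[-1]
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"not a Camelot key: {code!r}")
    return number, letter


def compatible_keys(code):
    """Return the four keys that blend harmonically with the given key."""
    number, letter = parse_camelot(code)
    previous = (number - 2) % 12 + 1  # neighbour on one side, 1 wraps to 12
    following = number % 12 + 1       # neighbour on the other side, 12 wraps to 1
    other = "A" if letter == "B" else "B"
    return [
        f"{number}{letter}",     # the same key
        f"{previous}{letter}",   # one step round the wheel
        f"{following}{letter}",  # one step the other way
        f"{number}{other}",      # the key directly above or below
    ]


print(compatible_keys("5B"))
print(compatible_keys("12A"))
```

For 5B this gives 5B, 4B, 6B and 5A, the same set as the example above.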
Because the Camelot Wheel is colour-coded, you'll find that you start to remember tracks by their colour-code as well as their numerical key, which is fine because tracks with the same colour code, or close match will be compatible for mixing.
Tools And DJ Software
Don't want to find the keys of your tracks by ear?
Don't want to think too hard about which of your tunes will blend well together in the mix?
Well luckily for you, this is the space-age and you don't have to put any work in if you don't want to.
There is a whole bunch of software out there that can get the job done for you, some better than others obviously.
You can go right from a simple bit of online software that will simply tell you the key of a track, to a whole DJ software package that'll barely stop short of driving you to your gig.
Here's a very small selection of a few different options:
Mixed In Key
Kind of becoming the daddy of harmonic mixing software.
And I've got to say I both love it and hate it.
I love how brilliantly it does what it says it does, I mean it really is pretty flawless.
Mixed In Key will show you the key of your tracks, and, using Camelot EasyMix (so using the Camelot Wheel coding system), identify the tracks in your playlist that will blend well together.
It does a ton of other cool stuff too.
But what I hate it for is its ‘Energy Level’ function.
Yeah it will rank every song in your playlist from 1-10 based on how danceable it is.
I mean, come on! Really?
If you're a DJ and you need a bit of software to tell you the energy level of your tunes, seriously, and I really mean this, sell all your DJ kit today, every last bit of it, and go and take up snooker or something…whatever, just never go near a set of decks again.
Serato DJ
Serato is one of the top software brands for digital vinyl DJing, and it also lends itself well to mixing in key.
For quite a while now the Serato DJ software has been able to analyse your music files for track key information. It's both easy to use and accurate.
Within Serato you can use any one of three different key notation systems, including the Camelot Wheel system. And it's colour-coded in the same way.
It also comes with quite a few advanced key mixing features, such as the ‘energy boost’ mix function.
AudioKeyChain
If you want to keep your key detection nice and simple, this might be for you.
This is an online, standalone track key finder.
It searches your music library, identifies the key and tempo of your playlist and identifies compatible tunes.
Job done!
There are plenty of other online key detection services, and quite a few other DJ software packages, such as the popular Ableton Live, also have key finder functionality.
Energy Boost Mixing
Dancefloor looking a bit worse for wear? Think they could do with a bit of a pick-me-up?
Or do you just want to add a big burst of extra excitement to an already energetic floor?
An energy boost mix could be just the thing you're looking for.
All you need to do to pull this off is mix into a key that is one or two semitones higher than your current track.
If you want to go up by one semitone using the Camelot Wheel coding, simply add seven to the number of your current track.
To go up by two semitones, add two to your current Camelot Wheel coding number.
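The add-seven and add-two rules are simple enough to put in a few lines of code. A quick sketch follows; energy_boost is just a name I've picked, and it assumes that numbers past 12 wrap back round the wheel to 1, which the rules above don't spell out.

```python
def energy_boost(code, semitones=1):
    """Return the Camelot key one or two semitones above the given key."""
    steps = {1: 7, 2: 2}  # wheel steps to add, as described above
    if semitones not in steps:
        raise ValueError("semitones must be 1 or 2")
    code = code.strip().upper()
    number, letter = int(code[:-1]), code[-1]
    if not 1 <= number <= 12 or letter not in ("A", "B"):
        raise ValueError(f"not a Camelot key: {code!r}")
    boosted = (number - 1 + steps[semitones]) % 12 + 1  # wrap past 12 back to 1
    return f"{boosted}{letter}"


print(energy_boost("5A", 1))
print(energy_boost("8A", 1))
print(energy_boost("11B", 2))
```

The letter stays the same in both cases, so a minor-key track boosts into another minor key.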
When, And When Not, To Mix In Key
When all is said and done, reading and understanding the mood and energy of a crowd is the number one most important job of a DJ.
Not slavishly sticking to a bunch of DJ rules.
We've all been to clubs and seen DJs with flawless technical skills, but the night was still flat. And we've all seen DJs whose skills weren't quite as sharp as they could be, but they took the roof off.
DJs who go with their gut feeling and connect with the dancefloor will always get my vote over techies.
So when the floor is buzzing and you absolutely know that you have the perfect track to drop next, don't disregard it if it's not perfectly in-key.
Chances are that 99% of the time, the tune that jumps into your head as the perfect next track to play, will be in-key anyway.
When should you mix in-key?
When you're building the mood of the night, tempting people onto the floor.
And when you're keeping hold of that mood, bringing the floor up to the boil, creating an atmosphere of anticipation.
Summing It All Up
Does it actually matter if your mixing is not completely harmonic all of the time?
There are two types of DJ.
DJs who love the technical side and love to make their mixes as totally seamless as possible. For these DJs, mixing in key is going to be super important.
And there are DJs who are much more interested in the music side, rather than the technical side of DJing. These DJs don't tend to worry so much about manufacturing each mix so that it's perfectly in key. They probably spend more time finding new music than they do practising their skills.
I tend to prefer to go to clubs where the second type of DJ is playing.
I can forgive some pretty average DJing skills if whoever is behind the decks is playing great, original tunes with love.
Get In Touch
What do you think about mixing in key?
Is it important to you as a DJ? Is it important to you as a clubber?
I'd love to know what you think so please go ahead and drop your comments directly below.
