7. Sort Lexicographically Unordered Columns After Casting to Categorical

Sorting with the key parameter, as above, can be confusing to some people. Is there a cleaner way? Pandas is arguably one of the most versatile libraries for data processing, so you can expect it to offer something neat for this relatively common problem: converting lexicographically unordered columns to categorical data.

  • We define a CategoricalDtype by specifying the order of the months.
  • We cast the month column to the newly defined categorical dtype.
  • When we sort by month, pandas uses the order of the months given in the categorical dtype definition.
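The three steps above can be sketched as follows. This is a minimal illustration, not the article's original code: the small DataFrame and its passenger counts are made up as a stand-in for the flights dataset, and the three-letter month labels are an assumption about how the months are spelled.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample rows; values are invented for illustration only
df = pd.DataFrame({
    "month": ["Mar", "Jan", "Apr", "Feb"],
    "passengers": [30, 10, 40, 20],
})

# Step 1: define a CategoricalDtype that records the calendar order
month_order = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
month_dtype = CategoricalDtype(categories=month_order, ordered=True)

# Step 2: cast the month column to the new dtype
df["month"] = df["month"].astype(month_dtype)

# Step 3: sorting now follows the category order, not alphabetical order
sorted_df = df.sort_values("month")
print(sorted_df["month"].tolist())  # ['Jan', 'Feb', 'Mar', 'Apr']
```

Any value in the column that is missing from the category list becomes NaN during the cast, so it is worth checking that the list covers every label in the data.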

MORE IDEAS FROM THE SAME ARTICLE

As we’ve seen so far, every sort has used ascending order, which is the default behavior. However, we often want the data sorted in descending order. For that, we can take advantage of the ascending parameter.
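A minimal sketch of descending order with the ascending parameter, using a small made-up DataFrame rather than the article's dataset:

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"year": [1949, 1951, 1950], "passengers": [10, 30, 20]})

# ascending=False reverses the default ascending order
desc = df.sort_values("passengers", ascending=False)
print(desc["passengers"].tolist())  # [30, 20, 10]
```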

In the previous sorting, you may notice that the original index stays with each sorted row, which sometimes puzzles me when I want the sorted DataFrame to have an ordered index. In this case, you can either reset the index after sorting, or simply take advantage of the ignore_index parameter, as ...
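Both options mentioned above can be compared side by side. The sketch below uses made-up values; ignore_index requires pandas 1.0 or later.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"passengers": [30, 10, 20]})

# Default: each row keeps its original index label
kept = df.sort_values("passengers")
print(kept.index.tolist())  # [1, 2, 0]

# Option 1: reset the index after sorting, dropping the old labels
reset = df.sort_values("passengers").reset_index(drop=True)

# Option 2: ask sort_values to renumber the rows directly
fresh = df.sort_values("passengers", ignore_index=True)
print(fresh.index.tolist())  # [0, 1, 2]
```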

In the previous sorting, one thing you may have noticed is that the sort_values method creates a new DataFrame object rather than modifying the original.
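A quick way to see this for yourself is to compare the returned object with the original. The sketch below uses made-up values; the inplace=True variant is a standard pandas option that the excerpt itself does not mention.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"passengers": [30, 10, 20]})

# sort_values returns a new object; the original stays unsorted
sorted_df = df.sort_values("passengers")
print(sorted_df is df)             # False
print(df["passengers"].tolist())   # [30, 10, 20]

# With inplace=True the DataFrame is modified and the call returns None
in_place = df.copy()
result = in_place.sort_values("passengers", inplace=True)
print(result)                           # None
print(in_place["passengers"].tolist())  # [10, 20, 30]
```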

We don’t always sort by just one column. In many cases, we need to sort the data frame by multiple columns. This is also simple with sort_values, because the by parameter accepts not only a single column but also a list of columns, without any special syntax.
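As a minimal sketch with made-up data, passing a list to by sorts on the first column and breaks ties with the next one. The per-column ascending list in the second call is a standard pandas feature that the excerpt does not cover.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({
    "year": [1950, 1949, 1950, 1949],
    "passengers": [15, 18, 26, 12],
})

# Sort by year first, then by passengers within each year
by_both = df.sort_values(["year", "passengers"])
print(by_both["passengers"].tolist())  # [12, 18, 15, 26]

# A list of booleans sets the direction for each column separately
mixed = df.sort_values(["year", "passengers"], ascending=[True, False])
print(mixed["passengers"].tolist())  # [18, 12, 26, 15]
```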

It’s important to remember that your datasets can always contain NaNs. Unless you’ve examined your data quality and know that there are no NaNs, you should pay attention to that. When we sort values, these NaNs are placed after all the other valid values by default. If we want to change this de...
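The default placement is easy to check on a small made-up column. The excerpt is cut off before it names a way to change the behavior; the na_position parameter used below is the standard pandas option for this, included here as an assumption about where the text was heading.

```python
import pandas as pd

# Hypothetical data with one missing value, invented for illustration
df = pd.DataFrame({"passengers": [30.0, float("nan"), 10.0]})

# Default behavior: NaN rows go last
default = df.sort_values("passengers")
print(default["passengers"].tolist())  # [10.0, 30.0, nan]

# na_position="first" moves NaN rows to the top instead
first = df.sort_values("passengers", na_position="first")
print(first["passengers"].tolist())  # [nan, 10.0, 30.0]
```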

Apparently, the sorted data isn’t what we expect: the months are not in the desired order. To fix this, we can take advantage of the sort_values method’s key parameter, to which we can pass a custom function for sorting, just like Python’s built-in
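A minimal sketch of the key approach, assuming three-letter month labels and a hypothetical month_rank lookup that is not part of the article. Unlike the key function in Python's built-in sorting, the pandas key receives the whole column as a Series and must return a Series of the same length; it requires pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical data, invented for illustration
df = pd.DataFrame({"month": ["Mar", "Jan", "Apr", "Feb"]})

# Plain sorting is alphabetical, which is not calendar order
print(df.sort_values("month")["month"].tolist())  # ['Apr', 'Feb', 'Jan', 'Mar']

# Hypothetical lookup table mapping each month label to its calendar position
month_rank = {"Jan": 0, "Feb": 1, "Mar": 2, "Apr": 3}

# key maps the column to sortable ranks; rows are ordered by those ranks
by_month = df.sort_values("month", key=lambda col: col.map(month_rank))
print(by_month["month"].tolist())  # ['Jan', 'Feb', 'Mar', 'Apr']
```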

In this article, we’ll be using the flights dataset, which records the monthly passenger...
