Python

Exploring Security, Metrics, and Error-handling with gRPC in Python

In my post “Using gRPC in Python,” we wrote a basic gRPC server implementing a users service. We are going to expand on it and explore more gRPC concepts, such as secure client-server communication via self-signed SSL certificates, implementing gRPC middleware (or interceptors), and error handling. We will be using Python 3.6 for our demos in this article. The git ...

scikit-learn: Building a multi-class classification ensemble

For the Kaggle Spooky Author Identification competition, I wanted to combine multiple classifiers into an ensemble and found the VotingClassifier, which does exactly that. We need to predict the probability that a sentence was written by one of three authors, so the VotingClassifier needs to make a ‘soft’ prediction. If we only needed to know the most likely author we ...
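
To make the ‘soft’ versus ‘hard’ distinction concrete, here is a minimal pure-Python sketch of the idea rather than the post's actual code. It assumes unweighted averaging, and the function names and probability values are hypothetical; in scikit-learn the equivalent behaviour is selected with VotingClassifier(voting='soft').

```python
def soft_vote(probabilities):
    """Average the per-class probabilities across classifiers (soft voting)."""
    n_classifiers = len(probabilities)
    n_classes = len(probabilities[0])
    return [
        sum(p[c] for p in probabilities) / n_classifiers
        for c in range(n_classes)
    ]


def hard_vote(probabilities):
    """Let each classifier vote for its top class and return the majority class index."""
    votes = [max(range(len(p)), key=p.__getitem__) for p in probabilities]
    return max(set(votes), key=votes.count)


# Hypothetical outputs of three classifiers for one sentence,
# one probability per author (illustrative numbers only).
predictions = [
    [0.45, 0.40, 0.15],
    [0.10, 0.80, 0.10],
    [0.50, 0.30, 0.20],
]

averaged = soft_vote(predictions)
print(averaged)                # one averaged probability per author
print(hard_vote(predictions))  # index of the majority-vote author
```

With these made-up numbers the two strategies disagree: two of the three classifiers narrowly prefer the first author, but the averaged probabilities favour the second. Soft voting also yields a probability per author, which is what a probability-based score needs.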

Modern Python Web Scraping Using Multiple Libraries

In this post, we will talk about Python web scraping and how to scrape web pages using multiple libraries such as Beautiful Soup and Selenium, along with some other tools like PhantomJS. You’ll learn how to scrape static web pages, Ajax-loaded content, and iframes, how to handle cookies, and much more. What is web scraping? Web scraping is the process of ...
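
As a rough illustration of scraping a static page, the sketch below pulls the links out of an HTML string. It deliberately uses only the standard library's html.parser instead of the libraries named above, and the LinkCollector class and the sample markup are hypothetical; a real scraper would download the page first.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a static HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical page content, for illustration only.
html = """
<html><body>
  <a href="/posts/1">First post</a>
  <a href="/posts/2">Second post</a>
  <a name="anchor-without-href">No link</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(html)
print(collector.links)
```

This only covers markup that is present in the downloaded HTML; content loaded later by JavaScript would not appear in the string at all, which is where browser-driving tools come in.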

Python: Learning about defaultdict’s handling of missing keys

While reading the scikit-learn code I came across a bit of code that I didn’t understand for a while but in retrospect is quite neat. This is the code snippet that intrigued me:

    vocabulary = defaultdict()
    vocabulary.default_factory = vocabulary.__len__

Let’s quickly see how it works by adapting an example from scikit-learn:

    >>> from collections import defaultdict
    >>> vocabulary = defaultdict()
    ...
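
The effect of that snippet can be seen in a small self-contained sketch. The token list is hypothetical and this is not the scikit-learn example itself: each key that is missing on lookup is assigned the dictionary's length at that moment, so new tokens receive consecutive integer indices.

```python
from collections import defaultdict

# A defaultdict whose factory is its own __len__: a missing key is given
# the number of keys stored so far, i.e. the next unused integer index.
vocabulary = defaultdict()
vocabulary.default_factory = vocabulary.__len__

# Hypothetical tokens, for illustration only.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
indices = [vocabulary[token] for token in tokens]

print(indices)           # [0, 1, 2, 3, 0, 4]
print(dict(vocabulary))  # {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4}
```

The repeated "the" maps back to 0 because the key already exists on the second lookup, so the factory is not called again.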

Python: Combinations of values on and off

In my continued exploration of Kaggle’s Spooky Authors competition, I wanted to run a GridSearch turning on and off different classifiers to work out the best combination. I therefore needed to generate combinations of 1s and 0s enabling different classifiers, e.g. if we had 3 classifiers we’d generate these combinations:

    0 0 1
    0 1 0
    1 0 0
    1 ...
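
One possible way to generate such on/off combinations is itertools.product, sketched below. The helper name on_off_combinations is hypothetical, dropping the all-zeros row is an assumption based on the listing above, and the row order may differ from the one the post arrives at.

```python
from itertools import product


def on_off_combinations(n):
    """Return every length-n tuple of 0s and 1s that has at least one 1."""
    return [combo for combo in product([0, 1], repeat=n) if any(combo)]


combos = on_off_combinations(3)
for combo in combos:
    print(*combo)
```

For n classifiers this yields 2**n - 1 rows, so 7 rows for 3 classifiers.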

scikit-learn: Creating a matrix of named entity counts

I’ve been trying to improve my score on Kaggle’s Spooky Author Identification competition, and my latest idea was to build a model that uses named entities extracted with the polyglot NLP library. We’ll start by learning how to extract entities from a sentence using polyglot, which isn’t too tricky:

    >>> from polyglot.text import Text
    >>> doc = "My name is David ...

Python: polyglot – ModuleNotFoundError: No module named ‘icu’

I wanted to use the polyglot NLP library that my colleague Will Lyon mentioned in his analysis of Russian Twitter Trolls, but I had installation problems, which I thought I’d share in case anyone else experiences the same issues. I started by trying to install polyglot:

    $ pip install polyglot
    ImportError: No module named 'icu'

Hmmm I’m not sure what ...

Python 3: TypeError: unsupported format string passed to numpy.ndarray.__format__

This post explains how to work around a change in how Python string formatting works for numpy arrays between Python 2 and Python 3. I’ve been going through Kevin Markham’s scikit-learn Jupyter notebooks and ran into a problem on the Cross Validation one, which was throwing this error when attempting to print the KFold example:

    Iteration Training set observations Testing ...
