Python

NOT FULLY PORTED YET.

Python is a popular and easy to use general purpose programming language that is heavily used in Data Analytics and Data Science as well as systems administration.

It's not as good for one-liners as Perl is though, and Perl one-liners are easier to embed in shell scripts.

Core Reading

Learning Python

DevOps Python tools

HariSekhon/DevOps-Python-tools

Shell scripts with Python

Shell scripts using Python and making it easier to install Python pip libraries from PyPI.

HariSekhon/DevOps-Bash-tools

Nagios Plugins in Python

HariSekhon/Nagios-Plugins

Python Library with Unit Tests

HariSekhon/pylib

VirtualEnv

Creates a virtual environment in the given local sub-directory in which to install PyPI modules, to avoid clashes with system Python libraries.

virtualenv "$directory_name_to_create"

I like to always use the directory name venv for the virtualenv:

virtualenv venv

Then activate it before you start pip installing:

source venv/bin/activate

This prepends the venv's bin directory to $PATH so that the bin/python and lib/python3.12/site-packages under the local venv directory are used.

Now install PyPI modules as usual.
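
To confirm from inside Python that the activated environment is really the one in use, a minimal sketch along these lines can help. The function name in_virtualenv is hypothetical, and the check relies on sys.prefix differing from sys.base_prefix inside a virtual environment (older virtualenv releases set sys.real_prefix instead):

```python
import sys


def in_virtualenv() -> bool:
    """Best-effort check for whether this interpreter runs inside a virtualenv."""
    # Legacy virtualenv releases (before 20) set sys.real_prefix
    if hasattr(sys, "real_prefix"):
        return True
    # venv and current virtualenv point sys.prefix at the environment
    # and keep the original installation in sys.base_prefix
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("prefix:", sys.prefix)
    print("inside a virtualenv:", in_virtualenv())
```

Running this before and after `source venv/bin/activate` should show the interpreter path switching to the one under venv/bin.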

The venv/pyvenv.cfg file will contain some metadata like this:

home = /opt/homebrew/Cellar/python@3.12/3.12.3/bin
implementation = CPython
version_info = 3.12.3.final.0
virtualenv = 20.25.3
include-system-site-packages = false
base-prefix = /opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12
base-exec-prefix = /opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12
base-executable = /opt/homebrew/Cellar/python@3.12/3.12.3/Frameworks/Python.framework/Versions/3.12/bin/python3.12
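
Since the file is just key = value lines, it is easy to read from a script. The following is a rough sketch, not part of virtualenv itself: parse_pyvenv_cfg is a hypothetical helper, and the sample text is copied from the first lines of the metadata above:

```python
def parse_pyvenv_cfg(text: str) -> dict:
    """Parse 'key = value' lines from pyvenv.cfg content into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# Sample content taken from the pyvenv.cfg shown above
SAMPLE = """\
home = /opt/homebrew/Cellar/python@3.12/3.12.3/bin
implementation = CPython
version_info = 3.12.3.final.0
virtualenv = 20.25.3
include-system-site-packages = false
"""

if __name__ == "__main__":
    cfg = parse_pyvenv_cfg(SAMPLE)
    print("Python version:", cfg["version_info"])
    print("system site-packages visible:", cfg["include-system-site-packages"] == "true")
```

For a real environment, pass the contents of venv/pyvenv.cfg instead of the sample string.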

Pipenv

https://pipenv.pypa.io/en/latest/

:octocat: pypa/pipenv

Combines Pip and VirtualEnv into one command.

brew install pipenv

Creates a Pipfile and Pipfile.lock, plus a virtualenv in a standard location $HOME/.local/share/virtualenvs/ if not already inside one.

pipenv install

Activates the virtualenv

pipenv shell

Automatically converts a requirements.txt file into a Pipfile:

pipenv install -r requirements.txt

Checks installed packages for known security vulnerabilities:

pipenv check

Dependency graph:

pipenv graph

Poetry

:octocat: python-poetry/poetry

Replaces pip requirements.txt for PyPI library management with a simple pyproject.toml.

Writes a lockfile to save versions like npm and go mod do.

Jupyter Notebook

(formerly called IPython Notebook)

https://ipython.org/notebook.html

Interactive web page where you can mix code blocks, rich notes and graphs on the same page, click to execute code blocks and form a page oriented workflow of results and analysis for sharing and demonstrating.

Libraries

You can search for libraries at pypi.org.

Some libraries you may find useful are below.

You can see most of these used throughout my GitHub repos.

General

  • GitPython - Git
  • sh - execute shell commands more easily
  • jinja2 - Jinja2 templating
  • humanize - converts units to human readable
  • pyobjc-framework-Quartz - control Mac UI
  • psutil
  • PyInstaller - bundle Python code into standalone executables (doesn't work for advanced code)
  • sasl

Web

  • requests - easy HTTP request library
  • beautifulsoup4 - HTML parsing library
  • Scrapy - web scraping
  • pycookiecheat - use or extract cookies from your browser's cookie jar to query websites directly or using curl
  • selenium - Selenium web testing framework

Databases

Cloud

  • boto3 - AWS
  • aws-consoler

CI/CD & Linting

  • python-jenkins - Jenkins
  • TravisPy - for Travis CI
  • pylint - Python linting CLI tool
  • grip - Grip renders local markdown using a local webserver
  • Markdown
  • MarkupSafe
  • checkov
  • semgrep - security / misconfiguration scanning
  • jsonlint
  • yamllint - CLI YAML linting tool

Unit Testing

  • unittest2
  • nose
  • Faker - generate fake but realistic data for unit testing, Python version of the original Perl library, comes with a faker command convenient for shell scripts:

Generate 10 fake addresses:

faker -r 10 address

Virtualization & Containerization

Pub/Sub

Big Data & NoSQL

Data Formats & Analysis

  • avro - Avro
  • ldif3 - LDAP LDIF format
  • jsonlint
  • Markdown
  • MarkupSafe
  • numpy - NumPy for scientific numeric processing
  • pandas - Pandas for data analysis
  • python-cson
  • pyarrow - Apache Arrow and Parquet support, but Parquet support in this is weak, prefer Parquet Tools
  • python-ldap
  • python-snappy - work with Snappy compression format, often pulled as a dependency
  • PyYAML - work with YAML files in Python
  • scikit-learn - SciKit Learn
  • toml
  • xmltodict
  • yamllint - CLI YAML linting tool
  • Faker - generate fake but realistic data for unit testing, Python version of the original Perl library, comes with a faker command convenient for shell scripts:

Generate 10 fake addresses:

faker -r 10 address

Data Visualization

  • matplotlib - General-purpose plotting, highly customizable
  • seaborn - built on matplotlib, higher level to make it easier to create aesthetic visualizations
  • plotly - Interactive graphs, dashboards, 3D plots
  • bokeh - Interactive, web-ready visualizations
  • pandas - Quick and easy plots directly from dataframes
  • networkx - Graph theory, network analysis
  • altair - Declarative statistical visualizations
  • pygal - Vector (SVG) visualizations, interactive
  • graph-tool - Scalable and efficient for large graph analysis

Jython

https://www.jython.org/

Python on the Java JVM.

The ease of Python coding with full access to Java APIs and libraries.

Useful when there aren't Python libraries available or they aren't as fully featured as the Java versions (eg. for Hadoop).

Today, I'd prefer to write in the native JVM language Groovy.

Install

From DevOps-Python-tools:

jython_install.sh

Run

Interactive REPL:

$ jython
Jython 2.7.3 (tags/v2.7.3:5f29801fe, Sep 10 2022, 18:52:49)
[OpenJDK 64-Bit Server VM (Eclipse Adoptium)] on java17.0.1
Type "help", "copyright", "credits" or "license" for more information.
>>>

Run a Jython script and add Java classpath to find any jar dependencies that the script uses:

jython -J-cp "$CLASSPATH" "file.py"

Code

Some Jython programs, such as those using Hadoop HDFS Java API can be found in the DevOps-Python-tools repo.

Python Hosting Sites

Hosted Python WebApps

Hosted Jupyter Notebooks

  • Google Colab - Jupyter Notebooks in the cloud with free access to GPUs
  • Binder - run Jupyter Notebooks from GitHub repos

Other Hosted Python

  • Replit - cloud-based IDE with AI to generate code from ideas
  • Glitch - good for prototyping small webapps
  • fly.io - code execution sandbox, runs any Docker image
  • Code Ocean - for scientific bioinformatics and R&D

Troubleshooting

Python Fault Handler

Prints stack trace on crash.

Useful for debugging native-level crashes, C extensions, system calls, OS signals.

Activates handling of signals like:

| Signal  | Description              |
|---------|--------------------------|
| SIGSEGV | Segmentation fault       |
| SIGFPE  | Floating-point exception |
| SIGBUS  | Bus error                |
| SIGABRT | Abort signal             |
| SIGILL  | Illegal instruction      |

Normally, these signals cause Python to crash without much useful information, but with the fault handler enabled, it'll output a traceback before the crash to help debug.

Minimal performance overhead, but bigger logs, and possibly dumps sensitive info.

Enable Python Fault Handler

export PYTHONFAULTHANDLER=1

or

python -X faulthandler "file.py"

or

import faulthandler
faulthandler.enable()
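
The same module can also dump the current stack on demand, which is a handy way to see what its output looks like without waiting for a crash. A minimal sketch, where dump_current_stack is a hypothetical helper name; faulthandler writes straight to a file descriptor, so a real temporary file is used rather than an in-memory buffer:

```python
import faulthandler
import tempfile


def dump_current_stack() -> str:
    """Write the calling thread's traceback to a temp file and return it as text."""
    with tempfile.TemporaryFile(mode="w+") as tmp:
        # faulthandler needs a file with a real file descriptor (fileno())
        faulthandler.dump_traceback(file=tmp, all_threads=False)
        tmp.seek(0)
        return tmp.read()


if __name__ == "__main__":
    # Install the crash handlers for the rest of the process (writes to stderr)
    faulthandler.enable()
    print("fault handler enabled:", faulthandler.is_enabled())
    print(dump_current_stack())
```

Note that faulthandler prints the most recent call first, the opposite order to a normal Python traceback.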

Alpine ModuleNotFoundError: No module named 'pip._vendor.six.moves'

ModuleNotFoundError: No module named 'pip._vendor.six.moves'

Fix:

apk del py3-pip py-pip
apk add py3-pip

Small vs Big Integers - is vs ==

Consider:

>>> a = 10
>>> b = 10
>>> a is b
True
>>> a = 500
>>> b = 500
>>> a is b
False

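This is because CPython caches small integer objects (-5 to 256), so both names point at the same object, whereas integers over 256 are created as separate objects each time.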
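
The effect can be reproduced in a script with a sketch like the one below. It assumes CPython, since the cache is an implementation detail, and the helper names are hypothetical. The integers are built at runtime with int() because literals in the same source file can be merged by the compiler, which would hide the difference:

```python
def fresh_int(text: str) -> int:
    """Build an int at runtime so the compiler cannot merge equal constants."""
    return int(text)


def compare(text: str) -> tuple:
    """Return (a == b, a is b) for two ints built independently from the same text."""
    a = fresh_int(text)
    b = fresh_int(text)
    return (a == b, a is b)


if __name__ == "__main__":
    for text in ("10", "256", "257", "500"):
        equal, identical = compare(text)
        print(f"{text}: == gives {equal}, is gives {identical}")
```

The == result is the same for every value; only the identity check depends on the cache.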

This is rarely an issue in practice though since the == comparison operator works as expected and most people will only use that.

Instead, people should try to maintain open source Python over a decade of bloody code changes and keep it working in different CI/CD systems to retain portability across different environments...

Python maintainability makes Java Null Pointer Exceptions look like the cheap “billion dollar mistake”.

Meme

Programming Python

Porting Your Language to the JVM

I wish I had discovered Groovy before Jython...

ChatGPT Python - What I Expected vs What I Got

Imported Package Tariffs

Friend Showing C++ Code

Lines of Code vs Understanding Methods

Partial port from private Knowledge Base page 2008+