Python for Data Analysis

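Python for Data Analysis (reprint edition) is written by Wes McKinney, the main author of the pandas library. It is a practical guide to data-intensive applications that use Python for scientific computing. It suits analysts who are just starting out with Python, as well as Python programmers moving into scientific computing.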

Introduction

Summary

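Are you looking for a complete guide to manipulating, processing, cleaning, and crunching structured data in Python? Python for Data Analysis (reprint edition) contains many case studies and uses several Python libraries, including NumPy, pandas, matplotlib, and IPython, to show how to solve a wide range of data analysis problems efficiently.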
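As a small illustration of the kind of structured-data task the summary refers to, the sketch below counts time zone values in a handful of records using only the Python standard library. It is not taken from the book: the records, the field name `tz`, and the helper `count_time_zones` are all made up for this example, and the book itself works with libraries such as pandas for this kind of job.

```python
from collections import Counter

# Hypothetical records standing in for structured data; the "tz" field
# name and every value here are invented for illustration.
records = [
    {"tz": "America/New_York"},
    {"tz": "America/Chicago"},
    {"tz": "America/New_York"},
    {"tz": ""},                      # empty value: skipped
    {"user": "no-timezone-field"},   # missing field: skipped
]


def count_time_zones(rows):
    """Count time zone values, ignoring rows where the field is missing or empty."""
    return Counter(row["tz"] for row in rows if row.get("tz"))


counts = count_time_zones(records)
top = counts.most_common(1)  # the single most frequent time zone and its count
print(top)
```

Filtering with `row.get("tz")` handles both the missing-field and the empty-string cases in one step, which is usually the first cleaning decision in a task like this.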

About the author

Author: Wes McKinney (United States)

Table of contents

Preface

1.Preliminaries

What Is This Book About?

Why Python for Data Analysis?

Python as Glue

Solving the "Two-Language" Problem

Why Not Python?

Essential Python Libraries

NumPy

pandas

matplotlib

IPython

SciPy

Installation and Setup

Windows

Apple OS X

GNU/Linux

Python 2 and Python 3

Integrated Development Environments (IDEs)

Community and Conferences

Navigating This Book

Code Examples

Data for Examples

Import Conventions

Jargon

Acknowledgements

2.Introductory Examples

1.usa.gov data from bit.ly

Counting Time Zones in Pure Python

Counting Time Zones with pandas

MovieLens 1M Data Set

Measuring rating disagreement

US Baby Names 1880-2010

Analyzing Naming Trends

Conclusions and The Path Ahead

3.IPython: An Interactive Computing and Development Environment

IPython Basics

Tab Completion

Introspection

The %run Command

Executing Code from the Clipboard

Keyboard Shortcuts

Exceptions and Tracebacks

Magic Commands

Qt-based Rich GUI Console

Matplotlib Integration and Pylab Mode

Using the Command History

Searching and Reusing the Command History

Input and Output Variables

Logging the Input and Output

Interacting with the Operating System

Shell Commands and Aliases

Directory Bookmark System

Software Development Tools

Interactive Debugger

Timing Code: %time and %timeit

Basic Profiling: %prun and %run -p

Profiling a Function Line-by-Line

IPython HTML Notebook

Tips for Productive Code Development Using IPython

Reloading Module Dependencies

Code Design Tips

Advanced IPython Features

Making Your Own Classes IPython-friendly

Profiles and Configuration

Credits

4.NumPy Basics: Arrays and Vectorized Computation

The NumPy ndarray: A Multidimensional Array Object

Creating ndarrays

Data Types for ndarrays

Operations between Arrays and Scalars

Basic Indexing and Slicing

Boolean Indexing

Fancy Indexing

Transposing Arrays and Swapping Axes

Universal Functions: Fast Element-wise Array Functions

Data Processing Using Arrays

Expressing Conditional Logic as Array Operations

Mathematical and Statistical Methods

Methods for Boolean Arrays

Sorting

Unique and Other Set Logic

File Input and Output with Arrays

Storing Arrays on Disk in Binary Format

Saving and Loading Text Files

Linear Algebra

Random Number Generation

Example: Random Walks

Simulating Many Random Walks at Once

5.Getting Started with pandas

Introduction to pandas Data Structures

Series

DataFrame

Index Objects

Essential Functionality

Reindexing

Dropping entries from an axis

Indexing, selection, and filtering

Arithmetic and data alignment

Function application and mapping

Sorting and ranking

Axis indexes with duplicate values

Summarizing and Computing Descriptive Statistics

Correlation and Covariance

Unique Values, Value Counts, and Membership

Handling Missing Data

Filtering Out Missing Data

Filling in Missing Data

Hierarchical Indexing

Reordering and Sorting Levels

Summary Statistics by Level

Using a DataFrame's Columns

Other pandas Topics

Integer Indexing

Panel Data

6.Data Loading, Storage, and File Formats

Reading and Writing Data in Text Format

Reading Text Files in Pieces

Writing Data Out to Text Format

Manually Working with Delimited Formats

JSON Data

XML and HTML: Web Scraping

Binary Data Formats

Using HDF5 Format

Reading Microsoft Excel Files

Interacting with HTML and Web APIs

Interacting with Databases

Storing and Loading Data in MongoDB

7.Data Wrangling: Clean, Transform, Merge, Reshape

Combining and Merging Data Sets

Database-style DataFrame Merges

Merging on Index

Concatenating Along an Axis

Combining Data with Overlap

Reshaping and Pivoting

Reshaping with Hierarchical Indexing

Pivoting "long" to "wide" Format

Data Transformation

Removing Duplicates

Transforming Data Using a Function or Mapping

Replacing Values

Renaming Axis Indexes

Discretization and Binning

Detecting and Filtering Outliers

Permutation and Random Sampling

Computing Indicator/Dummy Variables

String Manipulation

String Object Methods

Regular expressions

Vectorized string functions in pandas

Example: USDA Food Database

……

8.Plotting and Visualization

9.Data Aggregation and Group Operations

10.Time Series

11.Financial and Economic Data Applications

12.Advanced NumPy

Appendix: Python Language Essentials

Index

Endorsements

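"The scientific and data analysis communities have been waiting years for this book: it offers concrete practical advice along with insight into how the pieces fit together into a whole. It should become the standard reference for scientific computing in Python for years to come."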

- Fernando Perez, assistant researcher at UC Berkeley and one of the original creators of IPython
