Python
  • Introduction
  • Chapter 1.Notes from research
    • 1.Introduction of Python
    • 2. Build developer environment
      • 2.1.Sublime Text3
      • 2.2.Jupyter(IPython notebook)
        • 2.2.1.Introduction
        • 2.2.2.Basic usage
        • 2.2.3.some common operations
      • 2.3.Github
        • 2.3.1.Create Github account
        • 2.3.2.Create a new repository
        • 2.3.3.Basic operations: config, clone, push
      • 2.4.Install Python 3.4 in Windows
    • 3. Write Python code
      • 3.1.Hello Python
      • 3.2.Basic knowledges
      • 3.3.撰寫獨立python程式
      • 3.4.Arguments parser
      • 3.5.Class
      • 3.6.Sequence
    • 4. Web crawler
      • 4.1.Introduction
      • 4.2.requests
      • 4.3.beautifulSoup4
      • 3.4.a little web crawler
    • 5. Software testing
      • 5.1. Robot Framework
        • 1.1.Introduction
        • 1.2.What is test-automation framework?
        • 1.3.Robot Framework Architecture
        • 1.4.Robot Framework Library
        • 1.5.Reference
    • 6. encode/ decode
      • 6.1.編碼/解碼器的基本概念
      • 6.2.常見的編碼/ 解碼錯誤訊息與其意義
      • 6.3 .處理文字檔案
    • 7. module
      • 7.1.Write a module
      • 7.2.Common module
        • 7.2.1.sched
        • 7.2.2.threading
    • 8. Integrate IIS with django
      • 8.1.Integrate IIS with django
  • Chapter 2.Courses
    • 2.1.Python for Data Science and Machine Learning Bootcamp
      • 2.1.1.Virtual Environment
      • 2.1.2.Python crash course
      • 2.1.3.Python for Data Analysis - NumPy
        • 2.1.3.1.Numpy arrays
        • 2.1.3.2.Numpy Array Indexing
        • 2.1.3.3.Numpy Operations
      • 2.1.4.Python for Data Analysis - Pandas
        • 2.1.4.1.Introduction
        • 2.1.4.2.Series
        • 2.1.4.3.DataFrames
        • 2.1.4.4.Missing Data
        • 2.1.4.5.GroupBy
        • 2.1.4.6.Merging joining and Concatenating
        • 2.1.4.7.Data input and output
      • 2.1.5.Python for Data Visual Visualization - Pandas Built-in Data Visualization
      • 2.1.6.Python for Data Visualization - Matplotlib
        • 2.1.6.1.Introduction of Matplotlib
        • 2.1.6.2.Matplotlib
      • 2.1.7.Python for Data Visualization - Seaborn
        • 2.1.7.1.Introduction to Seaborn
        • 2.1.7.2.Distribution Plots
        • 2.1.7.3.Categorical Plots
        • 2.1.7.4.Matrix Plots
        • 2.1.7.5.Grids
        • 2.1.7.6.Regression Plots
      • 2.1.8. Python for Data Visualization - Plotly and Cufflinks
        • 2.1.8.1.Introduction to Plotly and Cufflinks
        • 2.1.8.2.Plotly and Cufflinks
      • 2.1.9. Python for Data Visualization - Geographical plotting
        • 2.1.9.1.Choropleth Maps - USA
        • 2.1.9.2.Choropleth Maps - World
      • 2.1.10.Combine data analysis and visualization to tackle real world data sets
        • 911 calls capstone project
      • 2.1.11.Linear regression
        • 2.1.11.1.Introduction to Scikit-learn
        • 2.1.11.2.Linear regression with Python
      • 2.1.12.Logistic regression
        • 2.1.12.1.Logistic regression Theory
        • 2.1.12.2.Logistic regression with Python
      • 2.1.13.K Nearest Neighbors
        • 2.1.13.1.KNN Theory
        • 2.1.13.2.KNN with Python
      • 2.1.14.Decision trees and random forests
        • 2.1.14.1.Introduction of tree methods
        • 2.1.14.2.Decision trees and Random Forests with Python
      • 2.1.15.Support Vector Machines
      • 2.1.16.K means clustering
      • 2.1.17.Principal Component Analysis
    • 2.2. Machine Learning Crash Course Jam
Powered by GitBook
On this page
  • Concatenation
  • Merge
  • Join

Was this helpful?

  1. Chapter 2.Courses
  2. 2.1.Python for Data Science and Machine Learning Bootcamp
  3. 2.1.4.Python for Data Analysis - Pandas

2.1.4.6.Merging joining and Concatenating

Previous2.1.4.5.GroupByNext2.1.4.7.Data input and output

Last updated 5 years ago

Was this helpful?

Concatenation

  • 注意: Dataframe的維度必須相同

pd.concate([df1, df2, df3])

pd.concate([df1, df2, df3], axis = 1)

Merge

  • 只能做橫向合併

  • 預設的merge的方式為

    • pd.merge(left, right, how = 'inner', on = 'key')
    • 指定多個key進行inner join

      pd.merge(left, right, on = ['key1', 'key2'])
    • pd.merge(left, right, how = 'right', on = ['key1', 'key2'])
    • pd.merge(left, right, how = 'left', on = ['key1', 'key2'])
    • pd.merge(left, right, how = 'outer', on = ['key1', 'key2'])
  • 如果除了key以外有其他的column也是同名的, 將會採用[列名_表名]作為新的column值

left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'C': ['A0', 'A1', 'A2', 'A3'],
                     'D': ['B0', 'B1', 'B2', 'B3']})

right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                          'C': ['C0', 'C1', 'C2', 'C3'],
                          'D': ['D0', 'D1', 'D2', 'D3']}) 

pd.merge(left,right,how='inner',on='key')
OUT:     
       C_x    D_x    key    C_y    D_y
0    A0    B0    K0    C0    D0
1    A1    B1    K1    C1    D1
2    A2    B2    K2    C2    D2
3    A3    B3    K3    C3    D3

Join

  • 可以做橫向或縱向合併

left = pd.DataFrame({'A': ['A0', 'A1', 'A2'],
                     'B': ['B0', 'B1', 'B2']},
                      index=['K0', 'K1', 'K2']) 

right = pd.DataFrame({'C': ['C0', 'C2', 'C3'],
                    'D': ['D0', 'D2', 'D3']},
                      index=['K0', 'K2', 'K3'])

left.join(right, how='outer')
inner
Inner join: 找出table間有共同key值的資料
right join: 以right的key為主, 加入left
left join: 以left的key為主, 加入right
outer join: left與right的left join table跟left與right的right join table 做union