Python: the commits I fetch from the GitHub API differ from the commits I get from git.Repo for the same project


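What I am trying to do: I want to extract the names of the classes that were modified in a pull request. To do that, I take the following steps.

From the GitHub API: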

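1) I extract all pull requests of a project

2) I extract all commits of each pull request

3) For each pull request, I keep only the first commit and the last commit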
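
Step 3 can be sketched as follows. This is only a minimal illustration: it assumes each entry returned by a pull request's commits endpoint carries a `sha` key and that the list is ordered from oldest to newest. `first_and_last` is a hypothetical helper name, not part of the code shown in this post.

```python
def first_and_last(commits):
    """Return (first_sha, last_sha) for one pull request, or None if it has no commits."""
    if not commits:
        return None
    # Assumption: the list is ordered oldest to newest, as returned by the API
    return commits[0]['sha'], commits[-1]['sha']
```

A pull request with a single commit yields the same SHA twice, so a diff between the two would be empty in that case.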

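Because at this point I do not know how to extract the list of classes modified between these two commits, I use the 'git' package as follows:

I cloned the project into
D:\\projects\\gson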

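import git

# Local clone of the project
repo = git.Repo("D:\\projects\\gson")
# With no argument, iter_commits() walks the commits reachable from HEAD, newest first
commits_list = list(repo.iter_commits())
temp = []
# Diff the newest commit against the oldest one and keep the Java files
for x in commits_list[0].diff(commits_list[-1]):
    if x.a_path == x.b_path:
        if x.a_path.endswith('.java'):
            temp.append(x.a_path)
    else:
        # The path changed (e.g. a rename): keep the new path
        if x.b_path.endswith('.java'):
            temp.append(x.b_path)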
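
The same filtering could be applied to two specific commits instead of the newest and oldest ones. The following is a rough sketch under assumptions: `java_paths` and `changed_java_files` are hypothetical helper names, `first_sha` and `last_sha` stand for the two SHAs kept for a pull request, and both commits must already exist in the local clone.

```python
from collections import namedtuple


def java_paths(diffs):
    """Collect the .java paths touched by an iterable of diff entries."""
    paths = []
    for d in diffs:
        # b_path is the path after the change; fall back to a_path when it is missing
        path = d.b_path or d.a_path
        if path and path.endswith('.java'):
            paths.append(path)
    return paths


def changed_java_files(repo_path, first_sha, last_sha):
    """Diff two specific commits of a local clone (requires GitPython)."""
    import git  # imported lazily so java_paths can be used without GitPython
    repo = git.Repo(repo_path)
    return java_paths(repo.commit(first_sha).diff(repo.commit(last_sha)))


# Usage with stand-in entries (no repository needed):
Entry = namedtuple('Entry', ['a_path', 'b_path'])
sample = [
    Entry('src/A.java', 'src/A.java'),
    Entry('README.md', 'README.md'),
    Entry('old/B.java', 'new/B.java'),
]
print(java_paths(sample))  # ['src/A.java', 'new/B.java']
```

Note that a diff from the first commit to the last one leaves out the changes introduced by the first commit itself.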
Here is how I extract the commits from the GitHub API:

import requests

nb = 0  # number of HTTP requests sent so far

projects = [{'owner': 'google', 'repo': 'gson', 'pull_requests': []}]

def get(url):
    """Fetch every page of a paginated GitHub API listing."""
    global nb

    PARAMS = {
        'client_id': '----my_client_id---',
        'client_secret': '---my_client_secret---',
        'per_page': 100,
        'state': 'all'  # open, closed, all
    }

    result = requests.get(url=url, params=PARAMS)
    nb += 1

    if result.status_code not in [200, 304]:
        raise Exception('request error', url, result, result.headers)

    data = result.json()
    # Follow the "next" link of the Link header until the last page
    while 'next' in result.links:
        result = requests.get(url=result.links['next']['url'], params=PARAMS)
        data.extend(result.json())
        nb += 1

    return data

def get_pull_requests(repo):
    url = 'https://api.github.com/repos/{}/pulls'.format(repo)
    return get(url)

def get_commits(url):
    return get(url)

for project in projects:
    project['pull_requests'] = get_pull_requests(
        '{}/{}'.format(project['owner'], project['repo']))
    # Attach the list of commits to each pull request
    for p in project['pull_requests']:
        p['commits'] = get_commits(p['commits_url'])
    print('{}/{}'.format(project['owner'], project['repo']), ':',
          len(project['pull_requests']))
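Both pieces of code work. The problem is that I get 287 commits from the GitHub API, but only 86 commits from git.Repo for the same project. When I try to match these commits, fewer than 40 of them match.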
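
The matching can be sketched as a set comparison on commit SHAs. This is a minimal sketch, assuming the API commits expose a `sha` key and the GitPython commits expose a `hexsha` attribute; `compare_shas` is a hypothetical helper name.

```python
def compare_shas(api_shas, local_shas):
    """Split two collections of commit SHAs into shared and one-sided sets."""
    api, local = set(api_shas), set(local_shas)
    return {
        'common': api & local,      # present on both sides
        'api_only': api - local,    # only returned by the GitHub API
        'local_only': local - api,  # only found in the local clone
    }
```

With the variables used in this post, it could be called as `compare_shas((c['sha'] for p in project['pull_requests'] for c in p['commits']), (c.hexsha for c in commits_list))`.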

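Questions:

1) Why do I get different commits for the same project?

2) Which one is correct, and which one should I use?

3) Is there a way to find out which pull request a commit obtained through git.Repo belongs to?

4) Is there a way to extract the classes modified between two commits with the GitHub API?

5) Does anyone know a better way to extract the classes modified by each pull request?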
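
For questions 4 and 5, one possible direction is the GitHub REST endpoint that lists the files of a pull request (`/repos/{owner}/{repo}/pulls/{number}/files`), whose entries carry a `filename` key. The following is a sketch under assumptions, not a verified solution for this project: `java_files` and `get_pull_request_files` are hypothetical helper names, and the request is sent without credentials, so it is subject to the low unauthenticated rate limit.

```python
def java_files(files):
    """Keep the .java file names from a list of file entries returned by the API."""
    return [f['filename'] for f in files if f['filename'].endswith('.java')]


def get_pull_request_files(owner, repo, number):
    """Fetch every page of the file list of one pull request (requires requests)."""
    import requests  # imported lazily so java_files can be used without requests
    url = 'https://api.github.com/repos/{}/{}/pulls/{}/files'.format(owner, repo, number)
    params = {'per_page': 100}
    files = []
    while url:
        response = requests.get(url, params=params)
        response.raise_for_status()
        files.extend(response.json())
        # The "next" URL already carries the query string, so drop params after page one
        url = response.links.get('next', {}).get('url')
        params = None
    return files
```

For a pull request `p` from the list above, `java_files(get_pull_request_files('google', 'gson', p['number']))` would give the Java files it touches. This yields file paths rather than class names, so mapping files to classes remains a separate step.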

I know this is a long post, but I have tried to be as specific as possible. I would appreciate answers to any of these questions.