
    Ingress in practice: worked examples

    1. Basic configuration

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: itlanson-ingress
      namespace: default
    spec:
      rules:
      - host: itlanson.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:  ## the backend service that should answer the request
              service:
                name: my-nginx-svc  ## name of the Service (svc) in the Kubernetes cluster
                port:
                  number: 80  ## port number of the Service



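    • pathType in detail:
    • Prefix: matches on a URL path prefix split by /. Matching is case-sensitive and is done element by element, where the path elements are the list of labels in the path separated by /. A request matches path p if every element of p is an element-wise prefix of the request path.
    • Exact: matches the URL path exactly, and is case-sensitive.
    • ImplementationSpecific: for this path type, matching depends on the IngressClass. An implementation can treat it as a separate pathType or treat it the same as the Prefix or Exact types.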
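    The element-wise matching described for Prefix and Exact can be illustrated with a minimal Python sketch. The function names are made up for this illustration, and the code only mirrors the rule as stated above; it is not the controller's implementation.

```python
def split_elements(path: str) -> list[str]:
    """Split a URL path into its '/'-separated elements, dropping empty ones."""
    return [element for element in path.split("/") if element]


def prefix_matches(rule_path: str, request_path: str) -> bool:
    """True if every element of rule_path is an element-wise prefix of request_path."""
    rule = split_elements(rule_path)
    request = split_elements(request_path)
    return request[: len(rule)] == rule


def exact_matches(rule_path: str, request_path: str) -> bool:
    """Exact matching is plain, case-sensitive string equality."""
    return rule_path == request_path


if __name__ == "__main__":
    print(prefix_matches("/foo/bar", "/foo/bar/baz"))  # whole elements match
    print(prefix_matches("/foo/bar", "/foo/barbaz"))   # "barbaz" is a different element
    print(exact_matches("/foo", "/foo/"))              # the trailing slash matters for Exact
```

    Comparing whole elements rather than raw string prefixes is what keeps /foo/bar from matching /foo/barbaz.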

    Ingress rules take effect in the nginx configuration of every machine on which the IngressController is installed.


    2. Default backend

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: itlanson-ingress
      namespace: default
    spec:
      defaultBackend:  ## default backend for every request that matches no rule
        service:
          name: php-apache
          port: 
            number: 80
      rules:
      - host: itlanson.com
        http:
          paths:
          - path: /abc
            pathType: Prefix
            backend:
              service:
                name: my-nginx-svc
                port:
                  number: 80



    Effect

    All requests to itlanson.com whose path does not start with /abc go to defaultBackend.

    All requests for any host other than itlanson.com also go to defaultBackend.
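    The routing decision described above can be sketched in a few lines of Python. The rule table copies the host, path and service names from the manifest; the function name pick_backend is hypothetical and the code is only an illustration of the behaviour, not how the controller is implemented.

```python
# Rules copied from the manifest above: host -> list of (path prefix, service).
RULES = {"itlanson.com": [("/abc", "my-nginx-svc")]}
DEFAULT_BACKEND = "php-apache"


def pick_backend(host: str, path: str) -> str:
    """Return the service that would answer a request for host + path."""
    request = [element for element in path.split("/") if element]
    for rule_path, service in RULES.get(host, []):
        rule = [element for element in rule_path.split("/") if element]
        # Prefix match, element by element.
        if request[: len(rule)] == rule:
            return service
    # No rule for this host, or no path under it matched.
    return DEFAULT_BACKEND


if __name__ == "__main__":
    for host, path in [("itlanson.com", "/abc/x"),
                       ("itlanson.com", "/other"),
                       ("example.org", "/abc")]:
        print(host, path, "->", pick_backend(host, path))
```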

    Global nginx configuration

    kubectl edit cm ingress-nginx-controller -n  ingress-nginx



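    Edit the configuration and add:

    data:
      <option name>: <option value>

    For the full list of options, see
    https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap
    (carried over via environment variables)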


    3. Path rewriting

    Rewrite - NGINX Ingress Controller

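    The rewrite feature is often used when the frontend and the backend are deployed separately.

    • Requests the frontend sends to / are mapped to the frontend address.
    • Requests sent to /api are routed to the corresponding backend service. The backend service itself has no /api path prefix, so the ingress controller has to strip it automatically.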
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      annotations:  ## set the annotations
      # https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/
        nginx.ingress.kubernetes.io/rewrite-target: /$2  ### keep only the second capture group
      name: rewrite-ingress-02
      namespace: default
    spec:
      rules:  ## define the rules
      - host: itlanson.com
        http:
          paths:
          - backend:
              service: 
                name: php-apache
                port: 
                  number: 80
            path: /api(/|$)(.*)
            pathType: ImplementationSpecific  ## regex paths are implementation-specific in ingress-nginx



    4. Configuring SSL

    TLS/HTTPS - NGINX Ingress Controller

    Generate a certificate (you can also request a free certificate from QingCloud and configure that instead):

    $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout ${KEY_FILE:-tls.key} -out ${CERT_FILE:-tls.cert} -subj "/CN=${HOST:-itlanson.com}/O=${HOST:-itlanson.com}"
    
    
    kubectl create secret tls ${CERT_NAME:-itlanson-tls} --key ${KEY_FILE:-tls.key} --cert ${CERT_FILE:-tls.cert}
    
    
    
    
    ## Example commands
    openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.cert -subj "/CN=it666.com/O=it666.com"
    
    
    kubectl create secret tls it666-tls --key tls.key --cert tls.cert



    apiVersion: v1
    data:
      tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURJekNDQWd1Z0F3SUJBZ0lKQVB6YXVMQ1ZjdlVKTUEwR0NTcUdTSWIzRFFFQkN3VUFNQ2d4RWpBUUJnTlYKQkFNTUNXbDBOalkyTG1OdmJURVNNQkFHQTFVRUNnd0phWFEyTmpZdVkyOXRNQjRYRFRJeE1EVXhNREV5TURZdwpNRm9YRFRJeU1EVXhNREV5TURZd01Gb3dLREVTTUJBR0ExVUVBd3dKYVhRMk5qWXVZMjl0TVJJd0VBWURWUVFLCkRBbHBkRFkyTmk1amIyMHdnZ0VpTUEwR0NTcUdTSWIzRFFFQkFRVUFBNElCRHdBd2dnRUtBb0lCQVFDbkNYa0wKNjdlYzNjYW5IU1V2VDR6YXZmMGpsOEFPWlBtUERhdUFRTElEby80LzlhV2JPSy9yZm5OelVXV3lTRFBqb3pZVApWa2xmQTZYRG1xRU5FSWRHRlhjdExTSlRNRkM5Y2pMeTlwYVFaaDVYemZId0ZoZXZCR1J3MmlJNXdVdk5iTGdWCmNzcmRlNXlKMEZYOFlMZFRhdjhibzhjTXpxN2FqZXhXMWc1dkxmTWZhczAvd2VyVk9Qc0ZmS3RwZ1dwSWMxMXEKekx6RnlmWHNjcVNhVTV2NFo5WHFqQjRtQjhZZ043U2FSa2pzU0VsSFU4SXhENEdTOUtTNGtkR2xZak45V2hOcAp6aG5MdllpSDIrZThQWE9LdU8wK2Jla1MrS3lUS2hnNnFWK21kWTN0MWJGenpCdjFONTVobTNQTldjNk9ROTh3CkYrQk9uUUNhWExKVmRRcS9BZ01CQUFHalVEQk9NQjBHQTFVZERnUVdCQlNzSUFvMHZ4RFZjVWtIZ1V1TFlwY0wKdjBFSERqQWZCZ05WSFNNRUdEQVdnQlNzSUFvMHZ4RFZjVWtIZ1V1TFlwY0x2MEVIRGpBTUJnTlZIUk1FQlRBRApBUUgvTUEwR0NTcUdTSWIzRFFFQkN3VUFBNElCQVFDSjFEdGJoQnBacTE1ODVEMGlYV1RTdmU3Q2YvQ3VnakxZCjNYb2gwSU9sNy9mVmNndFJkWXlmRFBmRDFLN0l4bElETWtUbTVEVWEyQzBXaFY5UlZLU0poSTUzMmIyeVRGcm8Kc053eGhkcUZpOC9CU1lsQTl0Tk5HeXhKT1RKZWNtSUhsaFhjRlEvUzFaK3FjVWNrTVh6UHlIcFl0VjRaU0hheQpFWVF2bUVBZTFMNmlnRk8wc2xhbUllTFBCTWhlTDNnSDZQNlV3TVpQbTRqdFR1d2FGSmZGRlRIakQydmhSQkJKCmZjTGY5QjN3U3k2cjBDaXF2VXQxQUNQVnpSdFZrcWJJV1d5VTBDdkdjVDVIUUxPLzdhTE4vQkxpNGdYV2o1MUwKVXdTQzhoY2xodVp3SmRzckNkRlltcjhTMnk0UDhsaDdBc0ZNOGorNjh1ZHJlYXovWmFNbwotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
      tls.key: LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0tLS0tCk1JSUV2QUlCQURBTkJna3Foa2lHOXcwQkFRRUZBQVNDQktZd2dnU2lBZ0VBQW9JQkFRQ25DWGtMNjdlYzNjYW4KSFNVdlQ0emF2ZjBqbDhBT1pQbVBEYXVBUUxJRG8vNC85YVdiT0svcmZuTnpVV1d5U0RQam96WVRWa2xmQTZYRAptcUVORUlkR0ZYY3RMU0pUTUZDOWNqTHk5cGFRWmg1WHpmSHdGaGV2QkdSdzJpSTV3VXZOYkxnVmNzcmRlNXlKCjBGWDhZTGRUYXY4Ym84Y016cTdhamV4VzFnNXZMZk1mYXMwL3dlclZPUHNGZkt0cGdXcEljMTFxekx6RnlmWHMKY3FTYVU1djRaOVhxakI0bUI4WWdON1NhUmtqc1NFbEhVOEl4RDRHUzlLUzRrZEdsWWpOOVdoTnB6aG5MdllpSAoyK2U4UFhPS3VPMCtiZWtTK0t5VEtoZzZxVittZFkzdDFiRnp6QnYxTjU1aG0zUE5XYzZPUTk4d0YrQk9uUUNhClhMSlZkUXEvQWdNQkFBRUNnZ0VBTDZ0Tlp6Q0MrdnB6cWRkd2VEcjhtS1JsckpXdkVxeVFaOW5mMnI4Ynpsd3IKdi9jTHB1dWJrTnBLZWx0OWFVNmZ1RlFvcDRZVmRFOG5MRlpocGNmVXd4UjNLV1piQ0dDZWVpSXdGaFIzVFloSApHb25FaE43WkxYSlVjN3hjemh5eTFGSTFpckZ5NFpoWVNTQXltYzdFSXNORFFKRVJ5ajdsdWF1TkNnOFdtWFdPCmd0OHIzZHVTazNHV2ZZeGdWclFZSHlGTVpCbUpvNDliRzVzdGcwR01JNUZRQXord3RERlIyaWk2NkVkNzBJOUwKYXJNMHpQZkM3Tk1acmhEcHVseVdVYWNXRDY1V1g1Yys5TnpIMW15MEVrbjJGOWQzNXE1czZRakdTVElMVXlhbwpJUVl5bGU0OVdKdlV4YjN2YTZ1OTVBUHAyWFFVaFEyS09GcGxabncwTVFLQmdRRFN2cDAzYlBvQVlEb3BqWGlxCndxemxKdk9IY2M4V3ZhVytoM0tvVFBLZ1dRZWpvVnNZTFEzM2lMeXdFY0FXaWtoSzE2UjVmTkt5VUFRZ2JDNm4KNTdkcUJ3L1RqYlV2UGR6K0llMnNKN1BlSlpCQktXZUNHNjBOeGgzUDVJcSsxRHVjdExpQTBKdVZyOUlaUzdqSApJOVpUMitDMTNlNkRlZkJaajFDb0ZhemJ1UUtCZ1FESzZCaVkzSk5FYVhmWVpKUzh1NFViVW9KUjRhUURBcmlpCjFGRlEzMDFPOEF0b1A2US9IcjFjbTdBNGZkQ3JoSkxPMFNqWnpldnF4NEVHSnBueG5pZGowL24yTHE3Z2x6Q2UKbVlKZFVVVFo0MkxJNGpWelBlUk1RaGhueW9CTHpmaEFYcEtZSU1NcmpTd1JUcnYyclRpQkhxSEZRbDN6YngvKwptcjdEVWtlR053S0JnRllPdEpDUGxiOVZqQ3F2dEppMmluZkE0aTFyRWcvTlBjT0IrQlkxNWRZSXhRL1NzaW83Cks3cnJRWEg4clo0R3RlS3FFR1h6ek80M3NwZXkxWktIRXVUZklWMVlQcWFkOG9Kc1JHdktncTZ5VkNmbnluYmMKNmx2M2pQRDUrSlpZZ0VkTG5SUXRHM3VTb283bDF2eXE2N2l1enlJMUVGTHNGblBjRENtM1FERXhBb0dBSDQrdQprOGhybDg2WDk2N2RlK1huTkhMSEZwbDBlNHRtME4wWnNPeXJCOFpLMy9KV1NBTXVEVU9pUzRjMmVCZHRCb0orClNqSy9xWXRTeEhRb3FlNmh6ZU5oRkN2Nnc3Q0F2WXEvUG1pdnZ2eWhsd0dvc3I1RHpxRFJUd091cFJ2cXE0aUsKWU9ObnVGU0RNRVlBOHNQSzhEcWxpeHRocGNYNVFnOHI4UkhSVWswQ2dZQlF3WFdQU3FGRElrUWQvdFg3dk1mTwp3WDdWTVFMK1NUVFA4UXNRSFo2djdpRlFOL3g3Vk1XT3BMOEp6TDdIaGdJV3JzdkxlV1pubDh5N1J3WnZIbm9zCkY3dkliUm00L1Y1YzZHeFFQZXk5RXVmWUw4ejRGMWhSeUc2ZjJnWU1jV25NSWpnaUh2dTA3cStuajFORkh4YVkKa2ZSSERia01YaUcybU42REtyL3RtQT09Ci0tLS0tRU5EIFBSSVZBVEUgS0VZLS0tLS0K
    kind: Secret
    metadata:
      creationTimestamp: "2022-06-10T12:06:22Z"
      name: it666-tls
      namespace: default
      resourceVersion: "2264722"
      uid: 16f8a4b6-1600-4ded-8458-b0480ce075ba
    type: kubernetes.io/tls



    Configure the domain to use the certificate

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: itlanson-ingress
      namespace: default
    spec:
      tls:
       - hosts:
         - itlanson.com
         secretName: itlanson-tls
      rules:
      - host: itlanson.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-nginx-svc
                port:
                  number: 80

    Once the certificate is configured, requests to the domain are redirected to HTTPS by default.


    5. Rate limiting

    Annotations - NGINX Ingress Controller

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: ingress-222333
      namespace: default
      annotations:  ## annotations
        nginx.ingress.kubernetes.io/limit-rps: "1"   ### rate limit: requests accepted per second from a given client IP
    spec:
      defaultBackend: ## used for any path that no rule maps
        service:
          name: php-apache
          port:
            number: 80
      rules:
      - host: it666.com
        http:
          paths:
          - path: /bbbbb
            pathType: Prefix
            backend:
              service:
                name: cluster-service-222
                port:
                  number: 80



    6. Canary (gray) releases

    Previously, canary deployments could be done with a Kubernetes Service combined with Deployments.



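    Drawback:

    • The canary logic cannot be customized, for example to send only specific users to the canary version.

    Ingress can now be used for canary releases instead.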



    ## Deploy two Service versions with the manifest below. The v1 version returns the default nginx page; the v2 version returns 11111
    apiVersion: v1
    kind: Service
    metadata:
      name: v1-service
      namespace: default
    spec:
      selector:
        app: v1-pod
      type: ClusterIP
      ports:
      - name: http
        port: 80
        targetPort: 80
        protocol: TCP
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name:  v1-deploy
      namespace: default
      labels:
        app:  v1-deploy
    spec:
      selector:
        matchLabels:
          app: v1-pod
      replicas: 1
      template:
        metadata:
          labels:
            app:  v1-pod
        spec:
          containers:
          - name:  nginx
            image:  nginx
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: canary-v2-service
      namespace: default
    spec:
      selector:
        app: canary-v2-pod
      type: ClusterIP
      ports:
      - name: http
        port: 80
        targetPort: 80
        protocol: TCP
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name:  canary-v2-deploy
      namespace: default
      labels:
        app:  canary-v2-deploy
    spec:
      selector:
        matchLabels:
          app: canary-v2-pod
      replicas: 1
      template:
        metadata:
          labels:
            app:  canary-v2-pod
        spec:
          containers:
          - name:  nginx
            image:  registry.cn-hangzhou.aliyuncs.com/lanson_k8s_images/nginx-test:env-msg
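    The text above does not show the canary Ingress itself. As a rough illustration of the kind of custom logic an Ingress-based canary allows, the Python sketch below models a decision in the style of the ingress-nginx canary annotations (nginx.ingress.kubernetes.io/canary-by-header, canary-by-cookie and canary-weight), checked in that order. The header name X-Canary, the cookie name canary and the 10% weight are assumptions chosen for the example; only the two service names come from the manifest above.

```python
import random
from typing import Callable, Mapping

STABLE = "v1-service"          # from the manifest above
CANARY = "canary-v2-service"   # from the manifest above


def choose_backend(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    *,
    header_name: str = "X-Canary",   # assumed header name
    cookie_name: str = "canary",     # assumed cookie name
    weight: int = 10,                # assumed percentage of traffic for the canary
    rng: Callable[[], float] = random.random,
) -> str:
    """Pick a backend: header first, then cookie, then weighted random split."""
    for value in (headers.get(header_name), cookies.get(cookie_name)):
        if value == "always":
            return CANARY
        if value == "never":
            return STABLE
        # Any other value (or a missing one) falls through to the next rule.
    return CANARY if rng() * 100 < weight else STABLE


if __name__ == "__main__":
    print(choose_backend({"X-Canary": "always"}, {}))
    print(choose_backend({}, {"canary": "never"}))
    print(choose_backend({}, {}))  # random, roughly 10% canary
```

    Routing by header or cookie is what makes it possible to send chosen users to the new version, which a plain Service in front of two Deployments cannot do.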



    7. Session persistence (session affinity)

    https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#session-affinity

    On the first visit, ingress-nginx returns a Cookie to the browser. The browser sends that Cookie with later requests, which ensures they always reach the same Pod as before.


    ## Deploy a Deployment with three Pods and expose it through a Service
    apiVersion: v1
    kind: Service
    metadata:
      name: session-affinity
      namespace: default
    spec:
      selector:
        app: session-affinity
      type: ClusterIP
      ports:
      - name: session-affinity
        port: 80
        targetPort: 80
        protocol: TCP
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name:  session-affinity
      namespace: default
      labels:
        app:  session-affinity
    spec:
      selector:
        matchLabels:
          app: session-affinity
      replicas: 3
      template:
        metadata:
          labels:
            app:  session-affinity
        spec:
          containers:
          - name:  session-affinity
            image:  nginx


    Write an Ingress with session affinity

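    Introduction: pipdeptree is the tool used to maintain Python packages in the TencentOS distribution. It can inspect and manage the dependency chains of Python packages. For the person responsible for distribution packages, digging into this area and mastering it is a prerequisite for keeping those packages stable and highly available. Starting as a beginner who had never used the tool, it took about a month of effort to complete the refactoring described here and to be invited to become a maintainer of the highly starred `pipdeptree` project. That required a great deal of spare time and patience, spent reading manuals and source code.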

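    Background

    As the AI ecosystem keeps growing, and with Python as the foundation language of AI, the number of related packages and code bases keeps increasing. Solving Python's dependency hell is therefore an optimization that pays off, mainly in two ways:

    • Trimming dependencies that some Python packages do not actually need makes AI/Python project images smaller and lighter.
    • Removing superfluous dependencies from an environment also keeps the development environment cleaner and makes the dependencies between projects clearer.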

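    Investigation

    pipdeptree does what we need, and the package is already integrated into TencentOS Server 4, where it can be installed with dnf install python3-pipdeptree.

    When we build images such as the Python runtime and PyTorch images, we also use pipdeptree to optimize them and provide smaller, lighter images (contact us if you need them).

    pipdeptree is simple to use. The two main invocations are: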

    pipdeptree -p ABC
    and
    pipdeptree -j

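    As for how it works, it relies on the pkg_resources library (from setuptools, vendored in pip). The two core pieces are:

    1. Environment.from_paths(None).iter_installed_distributions returns every Python package (dist) in the current environment: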
    from pip._internal.metadata import pkg_resources
    
    
    dists=pkg_resources.Environment.from_paths(None).iter_installed_distributions(
                local_only=local_only,
                skip=(),
                user_only=user_only,
            )

    2. DistInfoDistribution.requires() returns the dependencies of a given package:

    from pip._vendor.pkg_resources import DistInfoDistribution
    
    
    def requires(self) -> list[Requirement]:
        return self._obj.requires()  # type: ignore[no-untyped-call,no-any-return]

    The end result is a dependency tree of every installed package in the environment, for example:

    {
            "package": {
                "key": "adal",
                "package_name": "adal",
                "installed_version": "1.2.7"
            },
            "dependencies": [
                {
                    "key": "cryptography",
                    "package_name": "cryptography",
                    "installed_version": "41.0.4",
                    "required_version": ">=1.1.0"
                },
                {
                    "key": "pyjwt",
                    "package_name": "PyJWT",
                    "installed_version": "2.6.0",
                    "required_version": ">=1.0.0,<3"
                },
                {
                    "key": "python-dateutil",
                    "package_name": "python-dateutil",
                    "installed_version": "2.8.2",
                    "required_version": ">=2.1.0,<3"
                },
                {
                    "key": "requests",
                    "package_name": "requests",
                    "installed_version": "2.28.2",
                    "required_version": ">=2.0.0,<3"
                }
            ]
        },
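    1. Note, however, that the dependencies pkg_resources produces do not include those in tox.ini, that is, the dependencies needed for testing. The cause, and whether this is reasonable, needed further analysis.
      (After testing, this turned out not to be a problem: the tox entries are test dependencies, which in the rpm build system should appear only at build time and not as runtime dependencies. As runtime dependencies they are redundant and should be removed.)
    2. Note also that the pkg_resources interface is deprecated. importlib.metadata and importlib.resources are the recommended replacements, which is something that can be contributed upstream.
    3. See: https://github.com/tox-dev/pipdeptree/pull/333/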
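    For comparison with the pkg_resources calls shown earlier, here is a minimal standard-library sketch that collects the same kind of information through importlib.metadata. The function name installed_requirements is made up for this illustration; it is not pipdeptree's code.

```python
from importlib.metadata import distributions


def installed_requirements() -> dict[str, list[str]]:
    """Map each installed distribution name to its raw requirement strings."""
    result: dict[str, list[str]] = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        if not name:
            # Skip distributions with broken or missing metadata.
            continue
        # Distribution.requires is None when nothing is declared.
        result[name] = list(dist.requires or [])
    return result


if __name__ == "__main__":
    for name, requirements in sorted(installed_requirements().items()):
        print(name, requirements)
```

    Unlike DistInfoDistribution.requires(), the requirement entries here are plain strings and may carry environment markers, so they still have to be parsed before a tree can be built.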

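    At the same time, when our distribution packages Python packages it also uses rpm's own dependency declarations, BuildRequires and Requires. Compared with the dependency list of the Python package itself, distribution packaging easily introduces extra dependencies, which leaves redundancy in the package's dependency chain. There is also the problem that Python developers sometimes cannot list a package's dependencies completely and accurately: for example, after a code change a dependency may no longer be needed.

    To sum up, there are currently two optimization ideas:

    • Use pipdeptree to first trim unneeded Python packages from the environment. This relies on the developers of each Python package keeping their declared dependencies accurate.
    • Use a Python parsing tool to scan the code of these Python packages and check whether the declared dependencies are really used.

    Results of the experiment

    Step 1 demo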

    import json
    import subprocess
    import re
    
    
    FILTERED_DEPENDENCIES=['python3']
    
    
    def extract_package_name(dep):
        match=re.match(r'.*python3(?:\.\d+)?dist\(([^)]+)\).*', dep)
        if match:
            return match.group(1)
        return dep.split(' ')[0]
    
    
    def get_package_dependencies(package_name):
    
    
        package_name=get_rpmname(package_name)
    
    
        print(package_name)
        try:
            command=f'rpm-dep -i {package_name} -q'
            subprocess.run(command.split(), check=True)  # raise CalledProcessError on failure
            parse_cmd=f"jq -r '.next[] | .pkg_name' dep_tree__{package_name}__install.json | sort | uniq"
            output=subprocess.getoutput(parse_cmd)
            dependencies=output.strip().split('\n')
            return dependencies
        except subprocess.CalledProcessError:
            return []
    
    
    def get_rpmname(py_name):
        if not py_name.startswith("python-"):
            # try python3dist(ABC)
            command=f"dnf repoquery --whatprovides 'python3dist({py_name})' --latest-limit 1 --queryformat '%{{NAME}}' -q"
            output=subprocess.getoutput(command)
    
    
            # nothing provides python3dist(ABC): fall back to other spellings
            if output=="":
                # try lower case
                command=f"dnf repoquery --whatprovides 'python3dist({py_name.lower()})' --latest-limit 1 --queryformat '%{{NAME}}' -q"
                output=subprocess.getoutput(command)
            if output !="":
                py_name=output
            else:
                # last chance: look for an rpm named python3-ABC
                info_command=f'dnf info python3-{py_name}'
                info_result=subprocess.run(info_command.split(), stderr=subprocess.DEVNULL, stdout=subprocess.PIPE, text=True)
                if info_result.returncode !=0:
                    py_name="ERROR"
                else:
                    py_name=f"python3-{py_name}"
    
    
        else:
            py_name=py_name[7:]
            info_command=f'dnf info python3-{py_name}'
            info_result=subprocess.run(info_command.split(), stderr=subprocess.DEVNULL, stdout=subprocess.PIPE, text=True)
            if info_result.returncode !=0:
                py_name="ERROR"
            else:
                py_name=f"python3-{py_name}"
    
    
        return py_name
    
    
    
    
    def check_dependencies(package_data):
        package_name=package_data['package']['key']
        local_dependencies=[get_rpmname(dep['key']) for dep in package_data['dependencies']]
        repo_dependencies=get_package_dependencies(package_name)
    
    
        missing_dependencies=list(set(repo_dependencies) - set(local_dependencies))
        extra_dependencies=list(set(local_dependencies) - set(repo_dependencies))
    
    
        #print(local_dependencies)
        #print(repo_dependencies)
    
    
        # filter out the dependencies listed in FILTERED_DEPENDENCIES
        missing_dependencies=[dep for dep in missing_dependencies if dep not in FILTERED_DEPENDENCIES]
        extra_dependencies=[dep for dep in extra_dependencies if dep not in FILTERED_DEPENDENCIES]
    
    
        print(missing_dependencies)
        print(extra_dependencies)
    
    
        return {
            'package_name': get_rpmname(package_name),
            'missing_dependencies': missing_dependencies,
            'extra_dependencies': extra_dependencies
        }
    
    
    def main():
        with open('packages.json', 'r') as file:
            packages_data=json.load(file)
    
    
        result=[]
        for package_data in packages_data:
            package_result=check_dependencies(package_data)
            result.append(package_result)
    
    
        with open('result.json', 'w') as file:
            json.dump(result, file, indent=2)
    
    
    if __name__=='__main__':
        main()

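    Analysis of the final results shows that the following problems make the results inaccurate:

    • As noted in the previous section, tox.ini (the tox test-suite dependencies) is not part of the dependency list, so the rpm dependencies outnumber those pipdeptree finds, for example for python-oauth2client.
    • Some packages are not standard Python packages, or were not developed and declared according to Python packaging standards, so they declare no dependencies at all. This also makes the rpm dependencies outnumber those pipdeptree finds, for example for asciidoc.

    Step 2 demo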

    import ast
    import importlib.metadata
    import importlib.resources
    import json
    import os
    import sys
    import re
    
    
    # collect the names of the built-in modules
    builtin_modules=set(sys.builtin_module_names)
    
    
    def get_standard_library_modules():
        lib_path=os.path.dirname(os.__file__)
        modules=[]
    
    
        def add_module(root, file):
            module_path=os.path.relpath(os.path.join(root, file), lib_path)
            module_name=os.path.splitext(module_path.replace(os.path.sep, '.'))[0]
            if module_name.endswith('.__init__'):
                module_name=module_name[:-9]
            modules.append(module_name)
    
    
        for root, dirs, files in os.walk(lib_path):
            if 'site-packages' in dirs:
                dirs.remove('site-packages')
            if root==lib_path:
                # collect every .py file name on the first level
                for file in files:
                    if file.endswith('.py'):
                        add_module(root, file)
    
    
            # handle directory chains that contain an __init__.py file
            if '__init__.py' in files:
                add_module(root, '__init__.py')
    
    
        return modules
    
    
    # add the standard library modules found on disk
    builtin_modules.update(get_standard_library_modules())
    
    
    
    
    def parse_imports(file_path):
        with open(file_path, 'r') as file:
            content=file.read()
    
    
        # strip all single-line comments
        content=re.sub(r'#.*', '', content)
    
    
        # strip all multi-line (triple-quoted) comments
        content=re.sub(r'""".*?"""', '', content, flags=re.DOTALL)
    
    
        # match import and from ... import statements
        import_re=re.compile(r'(?:from\s+([.\w]+)(?:\s+import\s+[\w, ()]+)|import\s+([\w, ()]+))')
        matches=import_re.findall(content)
    
    
        imports=[]
        for match in matches:
            # match is a tuple: one element is an empty string, the other holds the module name(s)
            module_names=match[0] if match[0] else match[1]
            # a module name starting with '.' is a relative import, which we ignore
            if not module_names.startswith('.'):
                module_names=module_names.split(',')
                for module_name in module_names:
                    # handle aliased imports (import x as y) and keep only the top-level name
                    module_name=module_name.strip().split(' as ')[0].split('.')[0]
                    if module_name not in builtin_modules and not module_name.startswith('_'):
                        imports.append(module_name)
    
    
        return imports
    
    
    def get_package_imports():
        package_imports={}
    
    
        dists=importlib.metadata.distributions()
        for dist in dists:
            package_name=dist.metadata['Name']
            try:
                package_dir=importlib.resources.files(package_name)
                if package_dir is not None:
                    package_imports[package_name]={}
                    for root, dirs, files in os.walk(str(package_dir)):
                        for file in files:
                            if file.endswith('.py'):
                                file_path=os.path.join(root, file)
                                imports=parse_imports(file_path)
                                # de-duplicate and drop the package's own name
                                imports=list(set(imports))
                                if package_name in imports:
                                    imports.remove(package_name)
                                package_imports[package_name][file_path]=imports
            except Exception:  # skip distributions whose files cannot be located
                pass
    
    
        return package_imports
    
    
    # collect the import information of every package
    package_imports=get_package_imports()
    
    
    # convert to JSON and print
    json_data=json.dumps(package_imports, indent=4)
    print(json_data)
    
    
    # read the packages.json file
    with open('packages.json', 'r') as file:
        package_data=json.load(file)
    
    
    # check whether every import of each package appears in its dependencies
    for package in package_data:
        package_name=package['package']['package_name']
        if package_name in package_imports:
            dependencies={dep['package_name'] for dep in package['dependencies']}
            for file_path, imports in package_imports[package_name].items():
                for import_name in imports:
                    if import_name not in dependencies:
                        print(f'In package {package_name}, file {file_path} imports {import_name} which is not in dependencies.')
                    else:
                        print(f'In package {package_name}, file {file_path} imports {import_name} is found in pipdeptree.')

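    Analysis of the final results shows that the following problems make the results inaccurate:

    • With ast-based parsing, we could not tell whether an imported module is local to the current path or a public module. For example, in from .ABC import DEF, ABC is not a public Python module and should not be checked, yet after parsing it looked the same as from ABC import DEF, which produced false positives.
      This was eventually worked around by not using ast and parsing the text directly instead. Since only import / from ... import statements are involved, the parsing is simple enough for this to be feasible.
    • Some package names are non-standard: the Python package name and the module name differ, for example: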
    tooz==4.2.0
    ├── fasteners [required: >=0.7, installed: 0.19]
    ├── futurist [required: >=1.2.0, installed: 2.4.1]
    ├── msgpack [required: >=0.4.0, installed: 1.0.5]
    ├── oslo.serialization [required: >=1.10.0, installed: 5.0.0]
    ├── oslo.utils [required: >=4.7.0, installed: 6.0.1]
    ├── pbr [required: >=1.6, installed: 5.11.1]
    ├── stevedore [required: >=1.16.0, installed: 4.0.2]
    ├── tenacity [required: >=5.0.0, installed: 8.2.3]
    └── voluptuous [required: >=0.8.9, installed: 0.13.1]
    
    
    In package tooz, file /usr/lib/python3.11/site-packages/tooz/drivers/etcd3.py imports oslo_utils which is not in dependencies.

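    This also makes the collected Python package paths incomplete: the package name obtained through dists=importlib.metadata.distributions() (for example pycryptodome) differs from the actual module name (for example Crypto), and package_dir=importlib.resources.files(package_name) locates files by importing that name first, so it fails outright.
    This can be solved with importlib_metadata.packages_distributions, an API that returns a mapping between importable top-level modules and the distribution packages that provide them.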

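    Problems found so far

    Missing dependencies

    This is not necessarily a problem, because some modules are only weak dependencies: the package runs fine without them.

    Optional modules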

    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/response.py imports brotli which is not in dependencies.
    
    
    try:
        try:
            import brotlicffi as brotli
        except ImportError:
            import brotli
    except ImportError:
        brotli=None
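    Imports guarded like this could be recognized automatically so that they are not reported as missing. The following is a minimal sketch of that idea using ast; the function names are hypothetical, and treating a bare except or except Exception as an import guard is an assumption of this example.

```python
import ast

GUARD_NAMES = {"ImportError", "ModuleNotFoundError", "Exception"}


def _catches_import_error(handler: ast.ExceptHandler) -> bool:
    """True if the except clause would swallow a failed import."""
    if handler.type is None:  # bare "except:"
        return True
    caught = handler.type.elts if isinstance(handler.type, ast.Tuple) else [handler.type]
    return any(isinstance(node, ast.Name) and node.id in GUARD_NAMES for node in caught)


def optional_imports(source: str) -> set[str]:
    """Top-level modules imported inside try blocks that guard against ImportError."""
    optional: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Try) and any(_catches_import_error(h) for h in node.handlers):
            # Walk the whole try statement, including fallback imports in handlers.
            for child in ast.walk(node):
                if isinstance(child, ast.Import):
                    optional.update(alias.name.split(".")[0] for alias in child.names)
                elif isinstance(child, ast.ImportFrom) and child.level == 0 and child.module:
                    optional.add(child.module.split(".")[0])
    return optional


if __name__ == "__main__":
    snippet = (
        "try:\n"
        "    try:\n"
        "        import brotlicffi as brotli\n"
        "    except ImportError:\n"
        "        import brotli\n"
        "except ImportError:\n"
        "    brotli = None\n"
    )
    print(optional_imports(snippet))
```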

    Test code is another source of missing dependencies; these are low priority.

    In package zake, file /usr/lib/python3.11/site-packages/zake/test.py imports testtools which is not in dependencies.

    Upstream development issues

    In other cases upstream really did not declare its dependencies properly, for example the urllib3 package.

    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/appengine.py imports google which is not in dependencies.
    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/socks.py imports socks which is not in dependencies.
    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/pyopenssl.py imports OpenSSL which is not in dependencies.
    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/pyopenssl.py imports idna which is not in dependencies.
    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/pyopenssl.py imports cryptography which is not in dependencies.
    In package urllib3, file /usr/lib/python3.11/site-packages/urllib3/contrib/ntlmpool.py imports ntlm which is not in dependencies.

    Another example is tox, which is missing some dependencies:

    tox==4.10.0
    ├── cachetools [required: Any, installed: 5.3.1]
    ├── chardet [required: >=5.2, installed: 5.2.0]
    ├── colorama [required: >=0.4.6, installed: 0.4.6]
    ├── filelock [required: Any, installed: 3.12.4]
    ├── packaging [required: Any, installed: 23.1]
    ├── platformdirs [required: Any, installed: 2.5.4]
    ├── pluggy [required: Any, installed: 1.3.0]
    ├── pyproject-api [required: Any, installed: 1.5.1]
    │   └── packaging [required: >=23, installed: 23.1]
    └── virtualenv [required: >=20, installed: 20.21.1]
        ├── distlib [required: >=0.3.6,<1, installed: 0.3.7]
        ├── filelock [required: >=3.4.1,<4, installed: 3.12.4]
        └── platformdirs [required: >=2.4,<4, installed: 2.5.4]
    In package tox, file /usr/lib/python3.11/site-packages/tox/tox_env/python/virtual_env/package/pyproject.py imports tomli which is not in dependencies.
    In package tox, file /usr/lib/python3.11/site-packages/tox/execute/local_sub_process/read_via_thread_unix.py imports select which is not in dependencies.

    And tooz, among others.

    In package tooz, file /usr/lib/python3.11/site-packages/tooz/drivers/pgsql.py imports psycopg2 which is not in dependencies.
    In package tooz, file /usr/lib/python3.11/site-packages/tooz/drivers/mysql.py imports pymysql which is not in dependencies.

    Redundant dependencies

    The tool's results also revealed redundant dependencies in some Python packages, for example cheroot:

    {
        "package_name": "python3-cheroot",
        "missing_dependencies": [
          "python3-six",
          "python3-pyOpenSSL"
        ],
        "extra_dependencies": []
      },

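    The result shows two redundant dependencies (listed under missing_dependencies, that is, declared by the rpm but absent from the pipdeptree result). python3-six is a Python 2/3 compatibility package; the current cheroot version is fully adapted to Python 3, so upstream has already dropped this dependency: https://github.com/cherrypy/cheroot/commit/f3170d40a699219345abb5813395ff39319fec86

    pyOpenSSL appears in cheroot-10.0.0/stubtest_allowlist.txt and is a test dependency, so it can be kept as a build dependency only and trimmed from the runtime dependencies. Other distributions such as SUSE have already made the same cut: https://build.opensuse.org/projects/openSUSE:Factory/packages/python-cheroot/files/python-cheroot.changes See also: https://gitee.com/opencloudos-stream/python-cheroot/pulls/3

    Circular dependencies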

    Warning!! Cyclic dependencies found:
    * sphinxcontrib-serializinghtml=> Sphinx=> sphinxcontrib-serializinghtml
    * sphinxcontrib-htmlhelp=> Sphinx=> sphinxcontrib-htmlhelp
    * sphinxcontrib-qthelp=> Sphinx=> sphinxcontrib-qthelp
    * sphinxcontrib-applehelp=> Sphinx=> sphinxcontrib-applehelp
    * sphinxcontrib-devhelp=> Sphinx=> sphinxcontrib-devhelp
    * Sphinx=> sphinxcontrib-applehelp=> Sphinx

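    A circular dependency makes building a package depend on the versions of the other packages in the cycle, and turns what would be a simple dependency chain into a complicated one.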
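    Cycles like the ones in the warning above can be found with a depth-first walk over the dependency graph. The sketch below is a simple illustration and not pipdeptree's own algorithm; the function name find_cycles and the small graph are made up for the example.

```python
def find_cycles(graph: dict[str, list[str]]) -> list[list[str]]:
    """Return every simple cycle, reported once per starting package."""
    cycles: list[list[str]] = []

    def visit(node: str, path: list[str]) -> None:
        for dep in graph.get(node, []):
            if dep == path[0]:
                # We are back at the starting package: a cycle is closed.
                cycles.append(path + [dep])
            elif dep not in path:
                visit(dep, path + [dep])

    for start in graph:
        visit(start, [start])
    return cycles


if __name__ == "__main__":
    graph = {
        "Sphinx": ["sphinxcontrib-applehelp", "docutils"],
        "sphinxcontrib-applehelp": ["Sphinx"],
        "docutils": [],
    }
    for cycle in find_cycles(graph):
        print(" => ".join(cycle))
```

    The walk is exponential in the worst case, which is acceptable for a sketch over one environment's packages but would need memoization for large graphs.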

    Advanced: refactoring pipdeptree on the new API

    Upstream PR:

    https://github.com/tox-dev/pipdeptree/pull/333

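    Background

    pipdeptree lives under the tox-dev organization, where it is the most-starred project after tox itself. It is developed by gaborbernat, who currently works at Bloomberg. He is also the founder of a series of important Python community projects such as virtualenv, tox, platformdirs and filelock, and a maintainer of other core projects such as python-build, which makes him a well-known figure in the Python community.

    Removing the deprecated pkg_resources API and replacing it with importlib.metadata, packaging and so on requires refactoring pipdeptree's core logic. That touches the whole pipdeptree code tree, so it is fairly complex and has many pitfalls.

    How to bridge pkg_resources and importlib.metadata:

    Basic types

    • Every Python distribution package in the project is a DistPackage object. DistPackage is a subclass of Package, and the latter is built on pkg_resources.DistInfoDistribution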
    from pip._vendor.pkg_resources import DistInfoDistribution

    DistInfoDistribution can be replaced by importlib.metadata.Distribution.
    Note, however, that importlib.metadata.Distribution no longer has the key and project_name attributes, so every place that uses them has to switch to Distribution.metadata["Name"].
    For example:

    def __init__(self, obj: Distribution, req: ReqPackage | None=None) -> None:
            super().__init__(obj.metadata["Name"])
    • Each dependency of a distribution package is a Requirement object
    from pip._vendor.pkg_resources import Requirement

    It can be replaced by from packaging.requirements import Requirement.
    As above, it no longer has the key and project_name attributes, so they have to be replaced by .name, for example:

    def __init__(self, obj: Requirement, dist: DistPackage | None=None) -> None:
            super().__init__(obj.name)

    Attributes and APIs of the basic types

    local_only and user_only

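    Replacing the basic types does not solve everything; most of the work lies in the attributes and APIs those types support. The iter_installed_distributions API returns a list of DistInfoDistribution objects according to the arguments passed in: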

    iter_installed_distributions(local_only: bool=True, skip: Container[str]={'python', 'wsgiref', 'argparse'}, include_editables: bool=True, editables_only: bool=False, user_only: bool=False) -> Iterator[pip._internal.metadata.base.BaseDistribution] method of pip._internal.metadata.pkg_resources.Environment instance
        Return a list of installed distributions.

    The code involved is shown below. What we need to handle are its three arguments, in particular how to express local_only and user_only with the new API.

    from pip._internal.metadata import pkg_resources
    dists=pkg_resources.Environment.from_paths(None).iter_installed_distributions(
                local_only=local_only,
                skip=(),
                user_only=user_only,
            )

    local_only distinguishes a virtual environment from the global environment:

    (myenv) [root@linux ~]# python3 -c "import sys;print(sys.path)"
    ['', '/usr/lib64/python311.zip', '/usr/lib64/python3.11', '/usr/lib64/python3.11/lib-dynload', '/root/myenv/lib64/python3.11/site-packages', '/root/myenv/lib/python3.11/site-packages']
    (myenv) [root@linux ~]# python3 -c "import sys;print(sys.prefix)"
    /root/myenv
    (myenv) [root@linux ~]# python3 -c "import sys;print(sys.base_prefix)"
    /usr
    (myenv) [root@linux ~]#

    pip's own check is as follows:

    def _running_under_venv() -> bool:
        """Checks if sys.base_prefix and sys.prefix match.
    
    
        This handles PEP 405 compliant virtual environments.
        """
        return sys.prefix !=getattr(sys, "base_prefix", sys.prefix)
    
    
    
    
    def _running_under_legacy_virtualenv() -> bool:
        """Checks if sys.real_prefix is set.
    
    
        This handles virtual environments created with pypa's virtualenv.
        """
        # pypa/virtualenv case
        return hasattr(sys, "real_prefix")
    
    
    
    
    def running_under_virtualenv() -> bool:
        """True if we're running inside a virtual environment, False otherwise."""
        return _running_under_venv() or _running_under_legacy_virtualenv()

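    So we simplify this logic and only check whether sys.prefix and sys.base_prefix are the same: if they differ, we are inside a virtual environment; if they are the same, we are in the system environment. site.getsitepackages() then returns the Python paths (site-packages) under the given prefix.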

    in_venv=sys.prefix !=sys.base_prefix
    
    
        if local_only and in_venv:
            venv_site_packages=site.getsitepackages([sys.prefix])
            return list(distributions(path=venv_site_packages))

    Finally, the importlib.metadata.distributions function finds all Python distribution packages under those paths and returns them as Distribution objects. This API is a wrapper around Distribution.discover:

    def distributions(**kwargs):
        """Get all ``Distribution`` instances in the current environment.
    
    
        :return: An iterable of ``Distribution`` instances.
        """
        return Distribution.discover(**kwargs)

    Distribution.discover wraps the arguments it receives in a Context and then asks the metadata finders registered in sys.meta_path to look up all distributions:

    [root@linux ~]# python3 -c "import sys; print(sys.meta_path)"
    [<_distutils_hack.DistutilsMetaFinder object at 0x7f124cd15bd0>, <class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>]
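    The lookup described above can be sketched with public importlib.metadata names. This is a simplified illustration, not the CPython implementation: the function name discover is chosen here for the example, and the real code handles more cases.

```python
import sys
from importlib.metadata import Distribution, DistributionFinder


def discover(path=None):
    """Yield distributions reported by the metadata finders on sys.meta_path.

    `path` is an optional list of directories to search; when omitted,
    the Context falls back to sys.path.
    """
    kwargs = {} if path is None else {"path": path}
    context = DistributionFinder.Context(**kwargs)
    for finder in sys.meta_path:
        # Finders without a find_distributions hook are skipped.
        find = getattr(finder, "find_distributions", None)
        if find is not None:
            yield from find(context)


if __name__ == "__main__":
    for dist in discover():
        print(dist.metadata["Name"], dist.version)
```

    Passing an explicit path restricts the search to those directories, which is what makes the local_only and user_only filters possible.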

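    user_only, in turn, distinguishes the user environment from the global environment. Here we simply get the user site-packages directory via site.getusersitepackages() and pass it to distributions in the same way to get the list of distributions.

    if user_only:
        return list(distributions(path=[site.getusersitepackages()]))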
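    Putting the two filters together, a minimal self-contained sketch could look like the following. The function name discover_installed and the order in which the two flags are checked are assumptions made for this example, not taken from the pipdeptree source.

```python
import site
import sys
from importlib.metadata import Distribution, distributions


def in_virtualenv() -> bool:
    """True when sys.prefix differs from sys.base_prefix (PEP 405 venv)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def discover_installed(local_only: bool = False, user_only: bool = False) -> list[Distribution]:
    """Return installed distributions, optionally restricted to one scope."""
    if user_only:
        # Only the per-user site-packages directory
        return list(distributions(path=[site.getusersitepackages()]))
    if local_only and in_virtualenv():
        # Only the site-packages directories under the venv prefix
        return list(distributions(path=site.getsitepackages([sys.prefix])))
    # Default: everything reachable through sys.path
    return list(distributions())


if __name__ == "__main__":
    print(len(discover_installed()), "distributions on sys.path")
    print(len(discover_installed(user_only=True)), "in the user site")
```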

    Dependency retrieval API: Distribution.requires vs. DistInfoDistribution.requires()

    DistInfoDistribution.requires() returns pip._vendor.pkg_resources.Requirement objects directly, whereas Distribution.requires (https://github.com/python/cpython/blob/3.12/Lib/importlib/metadata/__init__.py#L558) returns plain strings.

    @property
    def requires(self):
        """Generated requirements specified for this Distribution"""
        reqs = self._read_dist_info_reqs() or self._read_egg_info_reqs()
        return reqs and list(reqs)

    So we need to process these strings and convert each one into a packaging.requirements.Requirement object.

    def requires(self) -> list[Requirement]:
        req_list = []
        req_name_list = []
        if self._obj.requires:
            for r in self._obj.requires:
                req = Requirement(r)
                # Skip requirements that only apply when an extra is requested
                is_extra_req = req.marker and contains_extra(str(req.marker))
                if not is_extra_req and req.name not in req_name_list:
                    req_list.append(req)
                    req_name_list.append(req.name)
        return req_list
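    The helper contains_extra is not shown in the article. The sketch below fills it in with a simple regular expression purely as an assumption, and wraps the same filtering in a standalone function (main_requirements, a name chosen for this example) so it can be run outside the class. It requires the third-party packaging library.

```python
import re

from packaging.requirements import Requirement

# Assumption: an "extra" requirement is one whose marker mentions `extra == ...`.
# Markers written the other way round ('tests' == extra) are not handled here.
_EXTRA_MARKER = re.compile(r"\bextra\s*==")


def contains_extra(marker: str) -> bool:
    return _EXTRA_MARKER.search(marker) is not None


def main_requirements(raw_requires):
    """Convert requirement strings to Requirement objects, dropping extras and duplicates."""
    result = []
    seen = set()
    for text in raw_requires or []:
        req = Requirement(text)
        is_extra_req = req.marker is not None and contains_extra(str(req.marker))
        if not is_extra_req and req.name not in seen:
            result.append(req)
            seen.add(req.name)
    return result


if __name__ == "__main__":
    raw = ["requests>=2.0", "pytest ; extra=='tests'", 'tomli ; python_version < "3.11"']
    print([str(r) for r in main_requirements(raw)])
```

    Requirements guarded by other markers, such as python_version, are kept; only those tied to an extra are dropped.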

    Note also that the dependencies returned by Distribution.requires now include markers, for example:

    "pytest ; extra=='tests'"

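    According to discussion 1: https://github.com/tox-dev/pipdeptree/pull/333#discussion_r1527662006

    and

    discussion 2: https://github.com/tox-dev/pipdeptree/pull/333#discussion_r1527881146

    for now we only need to cover the main dependencies; marker support can be added later if it is needed.

    Note: the extra marker seen here denotes an optional feature that can be requested at install time. For example, to install CT3 together with its markdown support, you specify the extra when installing:

    pip install CT3[markdown]

    This works because CT3's METADATA declares: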

    Provides-Extra: filters
    Requires-Dist: markdown ; extra=='filters'
    Provides-Extra: markdown
    Requires-Dist: markdown ; extra=='markdown'

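    A marker can contain extra as well as other conditions, such as a Python version constraint like python_version < "3.11". Each extra has its own dependencies, which means a module such as CT3 can be installed without markdown support, so the main module does not strictly require that dependency. This is also why the community suggested leaving extras aside for now. If extras do need to be listed, there are a few possible approaches:

    1. Determine how the main package was installed on the current system.
    2. If the dependency behind an extra is installed on the system, treat that feature as enabled and list it among the dependencies.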
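    The second approach could be sketched roughly as follows. This is only an illustration of the idea, not something pipdeptree implements: the regular expression handles just the simple name ; extra=='x' form shown above, and the names enabled_extras and is_installed are made up for this example.

```python
import re
from importlib.metadata import PackageNotFoundError, distribution

# Matches lines such as: markdown ; extra=='markdown'
_EXTRA_RE = re.compile(
    r"""^\s*([A-Za-z0-9][A-Za-z0-9._-]*).*;.*\bextra\s*==\s*['"]([^'"]+)['"]"""
)


def is_installed(name: str) -> bool:
    """True if a distribution with this name can be found on sys.path."""
    try:
        distribution(name)
    except PackageNotFoundError:
        return False
    return True


def enabled_extras(requires_dist, installed=is_installed):
    """Return the extras whose dependencies are all installed."""
    by_extra = {}
    for line in requires_dist:
        match = _EXTRA_RE.match(line)
        if match:
            by_extra.setdefault(match.group(2), []).append(match.group(1))
    return sorted(
        extra for extra, names in by_extra.items() if all(installed(n) for n in names)
    )


if __name__ == "__main__":
    metadata_lines = ["markdown ; extra=='filters'", "markdown ; extra=='markdown'"]
    print(enabled_extras(metadata_lines))
```

    One known weakness of this heuristic: a dependency may be installed for unrelated reasons, in which case the extra would be reported as enabled even though it was never requested.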

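    FrozenRequirement.from_dist compatibility

    In as_frozen_repr, the metadata.pkg_resources.Distribution object passed to FrozenRequirement.from_dist is expected to provide the following attributes. Neither DistPackage nor importlib.metadata.Distribution has them, so we need to implement them ourselves.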

    @property
    def editable(self) -> bool:
        return bool(self.editable_project_location)

    @property
    def direct_url(self) -> DirectUrl | None:
        direct_url_metadata_name = "direct_url.json"
        result = None

        try:
            j_content = self._obj.read_text(direct_url_metadata_name)
        except FileNotFoundError:  # pragma: no cover
            return result
        try:
            if j_content:
                result = DirectUrl.from_json(j_content)
        except (
            UnicodeDecodeError,
            json.JSONDecodeError,
            DirectUrlValidationError,
        ):
            return result
        return result

    @property
    def raw_name(self) -> str:
        return self.project_name

    @property
    def editable_project_location(self) -> str | None:
        direct_url = self.direct_url
        if direct_url and direct_url.is_local_editable():
            from pip._internal.utils.urls import url_to_path  # noqa: PLC2701, PLC0415

            return url_to_path(direct_url.url)

        result = None
        egg_link_path = egg_link_path_from_sys_path(self.raw_name)
        if egg_link_path:
            with Path(egg_link_path).open("r") as f:
                result = f.readline().rstrip()
        return result

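    For this compatibility work, see the discussion here: https://github.com/tox-dev/pipdeptree/pull/333#discussion_r1533235445
    What we need to do is read the package's direct_url.json file, parse its data, and assign it to members of DistPackage. The implementation follows pip's direct_url implementation; see the source: https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/metadata/base.py#L289

    We also need to implement the editable_project_location interface ourselves; its implementation follows this code: https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_internal/utils/egg_link.py#L33

    Editable mode for Python distributions and the DirectUrl module

    A brief introduction to the DirectUrl module: it mainly parses and exposes the contents of the direct_url.json file. Not every Python package has a direct_url.json file; it is typically present only for distributions installed from a URL, which may be local or remote. See the Direct URL introduction: https://packaging.python.org/en/latest/specifications/direct-url-data-structure/
    For example, install a local project in editable mode:

    # pip install -e munkres-1.1.4/

    You will then find direct_url.json in the package's .dist-info directory under the Python installation: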

    # ls /usr/local/lib/python3.11/site-packages/munkres-1.1.4.dist-info/
    INSTALLER  LICENSE.md  METADATA  RECORD  REQUESTED  WHEEL  direct_url.json  top_level.txt

    This file records the distribution's real location (url) and whether it was installed in editable mode (editable).

    {"dir_info": {"editable": true}, "url": "file:///data/gitee/python-munkres/munkres-1.1.4"}
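    As a rough stdlib-only illustration of how this file can be interpreted (pip's own DirectUrl class does considerably more validation), the sketch below extracts the local path of an editable install. The function name editable_location is an assumption for this example.

```python
from __future__ import annotations

import json
from urllib.parse import urlparse
from urllib.request import url2pathname


def editable_location(direct_url_text: str | None) -> str | None:
    """Return the local source path if direct_url.json describes an editable install."""
    if not direct_url_text:
        return None
    try:
        data = json.loads(direct_url_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    url = data.get("url", "")
    dir_info = data.get("dir_info")
    is_editable = isinstance(dir_info, dict) and bool(dir_info.get("editable", False))
    if is_editable and url.startswith("file:"):
        # Convert the file:// URL into a filesystem path
        return url2pathname(urlparse(url).path)
    return None


if __name__ == "__main__":
    text = '{"dir_info": {"editable": true}, "url": "file:///data/gitee/python-munkres/munkres-1.1.4"}'
    print(editable_location(text))
```

    With importlib.metadata, the input text can be obtained with dist.read_text("direct_url.json"), which returns None when the file is absent.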

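    In Python, editable means linking the project source directly into the Python directory (usually site-packages). It is commonly used during development: code changes take effect immediately, without going through the install process again. Besides installing from a URL with pip install as described above, there is another mechanism, .egg-link. When you run python3 setup.py develop in the project source directory, the following log appears: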

    running egg_info
    writing munkres.egg-info/PKG-INFO
    writing dependency_links to munkres.egg-info/dependency_links.txt
    writing top-level names to munkres.egg-info/top_level.txt
    reading manifest file 'munkres.egg-info/SOURCES.txt'
    reading manifest template 'MANIFEST.in'
    adding license file 'LICENSE.md'
    writing manifest file 'munkres.egg-info/SOURCES.txt'
    running build_ext
    Creating /usr/local/lib/python3.11/site-packages/munkres.egg-link (link to .)
    munkres 1.1.4 is already the active version in easy-install.pth
    
    
    Installed /data/gitee/python-munkres/munkres-1.1.4
    Processing dependencies for munkres==1.1.4
    Finished processing dependencies for munkres==1.1.4

    This creates munkres.egg-link under /usr/local/lib/python3.11/site-packages/. The .egg-link file contains the path to the project source:

    # cat /usr/local/lib/python3.11/site-packages/munkres.egg-link
    /data/gitee/python-munkres/munkres-1.1.4
    .
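    A simplified sketch of locating and reading such a file is shown below. It is not pip's egg_link_path_from_sys_path, which also tries normalized variants of the project name; the function names here are chosen for the example.

```python
from __future__ import annotations

import sys
from pathlib import Path


def find_egg_link(name: str, search_paths=None) -> Path | None:
    """Return the first <name>.egg-link file found on the given paths (default: sys.path)."""
    for directory in sys.path if search_paths is None else search_paths:
        candidate = Path(directory) / f"{name}.egg-link"
        if candidate.is_file():
            return candidate
    return None


def read_egg_link_target(egg_link: Path) -> str:
    """The first line of an .egg-link file is the project source directory."""
    with egg_link.open("r") as f:
        return f.readline().rstrip()


if __name__ == "__main__":
    link = find_egg_link("munkres")
    print(read_egg_link_target(link) if link else "no .egg-link found")
```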

    So the value of self.location in pip's source can be obtained by reading this .egg-link file, and the logic can be rewritten as follows:

    @property
    def editable_project_location(self) -> str | None:
        direct_url = self.direct_url
        if direct_url and direct_url.is_local_editable():
            from pip._internal.utils.urls import url_to_path  # noqa: PLC2701, PLC0415

            # url_to_path expects the URL string, not the DirectUrl object
            return url_to_path(direct_url.url)

        egg_link_path = egg_link_path_from_sys_path(self.raw_name)
        if egg_link_path:
            with Path(egg_link_path).open("r") as f:
                location = f.readline().rstrip()
            return location
        return None

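    Versions and comparison operators

    pkg_resources.Requirement has a specs attribute that holds the version constraint, for example the >=1.2 in a>=1.2. See the code: https://github.com/pypa/pip/blob/f5e4ee104e7b171a7cfb2843c9c602abf7a4e346/src/pip/_vendor/pkg_resources/__init__.py#L3153
    We replace it with the specifier attribute of packaging.requirements.Requirement; see the manual: https://packaging.pypa.io/en/stable/specifiers.html
    Note, however, that specifier is a SpecifierSet object, so it has to be converted to str before it can be handled as text.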

    @property
    def version_spec(self) -> str | None:
        result = None
        specs = sorted(map(str, self._obj.specifier), reverse=True)
        if specs:
            result = ",".join(specs)
        return result

    For the actual version comparison, however, it is better to keep the object and use its built-in containment check, for example:

    if ver_spec:
        req_obj = SpecifierSet(ver_spec)
    else:
        return False

    return self.installed_version not in req_obj
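    The two fragments above can be tried out as standalone functions. The sketch below uses the third-party packaging library; the function names version_spec and is_conflicting mirror the article's properties but are written here as free functions for illustration.

```python
from __future__ import annotations

from packaging.requirements import Requirement
from packaging.specifiers import SpecifierSet


def version_spec(req: Requirement) -> str | None:
    """Render the requirement's SpecifierSet as a comma-separated string."""
    specs = sorted(map(str, req.specifier), reverse=True)
    return ",".join(specs) if specs else None


def is_conflicting(installed_version: str, spec: str | None) -> bool:
    """True when the installed version does not satisfy the specifier string."""
    if not spec:
        return False
    # The `in` operator delegates to SpecifierSet.contains()
    return installed_version not in SpecifierSet(spec)


if __name__ == "__main__":
    spec = version_spec(Requirement("a>=1.2,<2"))
    print(spec, is_conflicting("1.1", spec), is_conflicting("1.5", spec))
```

    Comparing through SpecifierSet avoids string comparison pitfalls such as "1.10" sorting before "1.2".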

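    Fixing the test cases

    Introduction to Mock

    The Mock class from unittest.mock is used frequently in Python unit tests; it can stand in for a function or an object.

    Mocking a function: the example below fixes the return value of foo.read_text to json_text, so every call to foo.read_text returns json_text.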

    foo.read_text=Mock(return_value=json_text)

    Mocking an object: Mock can also stand in for an object. Once the mock's attributes are specified, any code that evaluates foo.metadata["Name"] gets "foo".

    foo=Mock(metadata={"Name": "foo"}, version="20.4.1")

    In this way we simulate the whole flow, from object attributes to methods, so that the function under test returns what we expect. For example:

    def test_dist_package_render_as_root_with_frozen() -> None:
        json_text='{"dir_info": {"editable": true}, "url": "file:///A/B/foo"}'
        foo=Mock(metadata={"Name": "foo"}, version="20.4.1")
        foo.read_text=Mock(return_value=json_text)
        dp=DistPackage(foo)
        is_frozen=True
        expect="# Editable install with no version control (foo===20.4.1)\n-e /A/B/foo"
        assert dp.render_as_root(frozen=is_frozen)==expect
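    The same pattern can be tried without pipdeptree installed. In the sketch below, is_editable is a hypothetical stand-in for the code under test; only the Mock usage mirrors the article.

```python
import json
from unittest.mock import Mock


def is_editable(dist) -> bool:
    """Hypothetical code under test: reads direct_url.json from a distribution."""
    data = json.loads(dist.read_text("direct_url.json"))
    return bool(data.get("dir_info", {}).get("editable"))


# Attributes are set through keyword arguments ...
foo = Mock(metadata={"Name": "foo"}, version="20.4.1")
# ... and a method is replaced by a Mock with a fixed return value.
foo.read_text = Mock(return_value='{"dir_info": {"editable": true}, "url": "file:///A/B/foo"}')

if __name__ == "__main__":
    print(foo.metadata["Name"], foo.version, is_editable(foo))
```

    Because foo.read_text is itself a Mock, the test can also verify how it was called, for example with foo.read_text.assert_called_with("direct_url.json").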

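    Mock does not support setting the name attribute

    The Mock constructor reserves the name argument for the mock's own repr, so Mock(name=XXX) does not set a name attribute; reading .name afterwards returns another Mock object. The attribute therefore has to be assigned after the mock is created, here on a MagicMock: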

    -    result=ReqPackage(mocker.MagicMock(key="setuptools")).installed_version
    +    r=MagicMock()
    +    r.name="setuptools"
    +    result=ReqPackage(r).installed_version
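    The behaviour is easy to reproduce in isolation. The following short sketch uses only unittest.mock; the variable names are chosen for the example.

```python
from unittest.mock import MagicMock, Mock

# `name` passed to the constructor only labels the mock in its repr.
named_in_constructor = Mock(name="setuptools")

# Assigning the attribute after creation gives a real string.
assigned_later = MagicMock()
assigned_later.name = "setuptools"

# configure_mock is another documented way to set a `name` attribute.
configured = Mock()
configured.configure_mock(name="setuptools")

if __name__ == "__main__":
    print(repr(named_in_constructor))
    print(type(named_in_constructor.name).__name__)
    print(assigned_later.name, configured.name)
```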

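    Virtual environment tests failing

    As described here: https://github.com/tox-dev/pipdeptree/pull/333/#issuecomment-2018311809
    when a virtual environment is created via virtualenv.cli_run and a command is run inside it, the output has to be captured with pytest.CaptureFixture. The code captured only standard output and discarded standard error, so the command that actually failed showed no error.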

    out, _=capfd.readouterr()
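    One way to avoid losing the error is to check both streams. The helper below is a hypothetical sketch, not part of pipdeptree's test suite: it accepts any object with a readouterr() method, such as pytest's capfd fixture, and fails loudly when stderr is not empty. Treating every stderr output as a failure is an assumption that may be too strict for commands that emit warnings.

```python
def read_stdout_or_fail(capture) -> str:
    """Return captured stdout, raising if anything was written to stderr.

    `capture` is expected to behave like pytest's capfd fixture, whose
    readouterr() returns an (out, err) pair.
    """
    out, err = capture.readouterr()
    if err:
        raise AssertionError(f"command wrote to stderr:\n{err}")
    return out


class FakeCapture:
    """Minimal stand-in for capfd, used only to demonstrate the helper."""

    def __init__(self, out: str, err: str) -> None:
        self._out, self._err = out, err

    def readouterr(self):
        return self._out, self._err


if __name__ == "__main__":
    print(read_stdout_or_fail(FakeCapture("pip==24.0\n", "")))
```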

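    The root cause turned out to be the newly added dependency on packaging: the virtual environment lacked it, so the command failed. Installing packaging in the test environment resolves this.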

    -        expected={"pip", "setuptools", "wheel"}
    +        expected={"packaging", "pip", "setuptools", "wheel"}

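    Further improvement: a Python test environment should preferably not depend on the external network, so running pip install packaging inside the test environment is unstable.

    We therefore decided to refactor the logic behind the tool's --python option by "stealing" the packaging module from the outer environment (the one running pipdeptree) and copying it into the test workspace.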

    packaging_src=getsourcefile(sys.modules["packaging"])
    assert packaging_src is not None
    packaging_root=Path(packaging_src).parent
    copytree(packaging_root, dest / "packaging")
    cmd=[str(py_path), "-m", "pipdeptree", *argv]
    env=os.environ.copy()
    return call(cmd, cwd=project, env=env)

    This takes full advantage of the fact that python -m adds the current working directory to sys.path, as the Python documentation states:

    By default, as initialized upon program startup, a potentially unsafe path is prepended to [`sys.path`](https://docs.python.org/3/library/sys.html#sys.path) (before the entries inserted as a result of [`PYTHONPATH`](https://docs.python.org/3/using/cmdline.html#envvar-PYTHONPATH)):
    - `python -m module` command line: prepend the current working directory.
    - `python script.py` command line: prepend the script’s directory. If it’s a symbolic link, resolve symbolic links.
    - `python -c code` and `python` (REPL) command lines: prepend an empty string, which means the current working directory.
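    This behaviour can be checked with a small experiment. The sketch below is an illustration with made-up names (run_module_in, probe_mod): it runs python -m from a directory containing a module that is installed nowhere else. PYTHONSAFEPATH is removed from the child environment because that setting disables the prepending of the working directory.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_module_in(directory, module: str) -> subprocess.CompletedProcess:
    """Run `python -m <module>` with `directory` as the working directory."""
    env = os.environ.copy()
    env.pop("PYTHONSAFEPATH", None)  # safe-path mode would not prepend the cwd
    return subprocess.run(
        [sys.executable, "-m", module],
        cwd=directory,
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # The module exists only in the working directory.
        (Path(tmp) / "probe_mod.py").write_text('print("imported from cwd")\n')
        print(run_module_in(tmp, "probe_mod").stdout.strip())
```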

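    Simulating a virtual environment

    Because pipdeptree has a --local-only option, we need to simulate the tool running inside a virtual environment. A Python virtual environment essentially sets sys.prefix to a custom directory and then uses the Python tool set installed under it. So we simulate pipdeptree running in a virtual environment as follows, using monkeypatch to override interpreter state such as sys.prefix and the command-line arguments in sys.argv.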

    def test_local_only(
        tmp_path: Path,
        monkeypatch: pytest.MonkeyPatch,
        capfd: pytest.CaptureFixture[str],
    ) -> None:
        prefix=str(tmp_path / "venv")
        result=virtualenv.cli_run([str(tmp_path / "venv"), "--activators", ""])
        pip_path=str(result.creator.exe.parent / "pip")
        subprocess.run(
            [pip_path, "install", "wrapt", "--prefix", prefix],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        cmd=[str(result.creator.exe.parent / "python3")]
        monkeypatch.chdir(tmp_path)
        cmd +=["--local-only"]
        monkeypatch.setattr(sys, "prefix", [str(tmp_path / "venv")])
        monkeypatch.setattr(sys, "argv", cmd)
        main()
        out, _=capfd.readouterr()
        found={i.split("==")[0] for i in out.splitlines()}
        expected={"wrapt", "pip", "setuptools", "wheel"}
    
    
        if sys.version_info >=(3, 12):
            expected -={"setuptools", "wheel"}  # pragma: no cover
    
    
        assert found==expected
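    The core trick, overriding sys.prefix so that the venv check flips, can be shown in isolation. The sketch below uses unittest.mock.patch.object from the standard library in place of pytest's monkeypatch; the helper name simulate_venv is an assumption for this example.

```python
import sys
from unittest import mock


def in_virtualenv() -> bool:
    """Same check as above: a venv has a prefix different from the base prefix."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def simulate_venv(prefix: str):
    """Context manager that temporarily points sys.prefix at `prefix`."""
    return mock.patch.object(sys, "prefix", prefix)


if __name__ == "__main__":
    fake_prefix = sys.base_prefix + "-fake-venv"
    with simulate_venv(fake_prefix):
        print("inside:", in_virtualenv())
    print("restored prefix:", sys.prefix)
```

    The original value is restored when the with block exits, just as monkeypatch undoes its changes at the end of a test.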


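    Summary

    After the core refactoring described above was completed, it was approved by two core maintainers, including gaborbernat. In the end I accepted their invitation and became a member of the project.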





    Author: cunshun

    Source: WeChat official account 鵝廠架構師

    Original article: https://mp.weixin.qq.com/s/dSWwbxzDfuxLWu0Achg8jg
