The Python pycurl module
Reposted from: http://www.cnblogs.com/gide/p/5650655.html
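1. pycurl overview
PycURL is a Python binding, written in C, for libcurl. libcurl is a free, easy-to-use client-side URL transfer library. It is very capable; the PycURL home page lists the following supported features: FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP. libcurl supports HTTPS certificates, HTTP POST, HTTP PUT, FTP uploading, Kerberos, HTTP form-based upload, proxies, cookies, user+password authentication, file transfer resume, HTTP proxy tunneling and more!
Because PycURL is implemented natively in C, it is generally considerably faster than the pure-Python urllib and urllib2 modules, and it may also be more efficient than Requests. The advantage is most noticeable when PycURL is used to issue many concurrent requests.
If PycURL is not installed on the machine, install it with the command sudo easy_install pycurl (on current systems, pip install pycurl does the same job).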
2. Using pycurl
Example 1:
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import sys
    import time

    import pycurl


    class Test:
        def __init__(self):
            self.contents = b""

        def body_callback(self, buf):
            # pycurl hands the response body to this callback as bytes,
            # one chunk at a time
            self.contents = self.contents + buf


    sys.stderr.write("Testing %s\n" % pycurl.version)

    start_time = time.time()
    url = "http://www.dianping.com/beijing"
    t = Test()
    c = pycurl.Curl()
    c.setopt(c.URL, url)
    c.setopt(c.WRITEFUNCTION, t.body_callback)
    c.perform()
    end_time = time.time()
    duration = end_time - start_time

    # HTTP status code and the final URL after the transfer
    print(c.getinfo(pycurl.HTTP_CODE), c.getinfo(pycurl.EFFECTIVE_URL))
    c.close()

    print("pycurl takes %s seconds to get %s" % (duration, url))
    print("length of the content is %d" % len(t.contents))
    # print(t.contents)
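The callback in Example 1 is called once per chunk of the response body, not once for the whole body. The sketch below shows why it is safer to collect raw bytes and decode them only after the transfer has finished. It does not use pycurl: the chunks are simulated by hand, and the `BodyCollector` class and the sample text are made up for illustration.

```python
class BodyCollector:
    """Collects the byte chunks a write callback receives (illustrative helper)."""

    def __init__(self):
        self.chunks = []

    def write(self, chunk):
        # The callback may run many times per transfer; store raw bytes only.
        self.chunks.append(chunk)
        return len(chunk)  # report that the whole chunk was consumed

    def text(self, encoding="utf-8"):
        # Join first and decode once, so characters split across chunks survive.
        return b"".join(self.chunks).decode(encoding)


# Simulated transfer: a UTF-8 body delivered in 2-byte pieces (made-up data).
body = "大众点评 Beijing".encode("utf-8")
pieces = [body[i:i + 2] for i in range(0, len(body), 2)]

collector = BodyCollector()
for piece in pieces:
    collector.write(piece)

decoded = collector.text()

# Decoding a single piece on its own fails here, because the first
# character occupies three bytes and is split across two pieces.
try:
    pieces[0].decode("utf-8")
    per_chunk_ok = True
except UnicodeDecodeError:
    per_chunk_ok = False

print(decoded, per_chunk_ok)
```

With pycurl, a bound method such as `collector.write` could be passed as the `WRITEFUNCTION` in the same way `t.body_callback` is passed in Example 1.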
Example 2: wrapping GET and POST in functions
    from io import BytesIO

    import pycurl


    # ---------------- function that handles cookies automatically ---------------- #
    def initCurl():
        """Initialise a pycurl object.

        urllib2 also supports cookies, but logging in to the CAS system with it
        always failed, and the cause was never found. pycurl is used here mainly
        because, once cookies are configured, it logs in to CAS correctly.
        """
        c = pycurl.Curl()
        c.setopt(pycurl.COOKIEFILE, "cookie_file_name")  # read cookies from this file
        c.setopt(pycurl.COOKIEJAR, "cookie_file_name")   # save cookies to this file on close
        c.setopt(pycurl.FOLLOWLOCATION, 1)               # follow redirects
        c.setopt(pycurl.MAXREDIRS, 5)                    # at most 5 redirects
        # Proxy settings: uncomment and fill in suitable values if needed
        # c.setopt(pycurl.PROXY, 'http://11.11.11.11:8080')
        # c.setopt(pycurl.PROXYUSERPWD, 'aaa:aaa')
        return c


    # ------------------------------- GET function -------------------------------- #
    def GetData(curl, url):
        """Fetch the resource at url using the HTTP GET method."""
        head = ["Accept:*/*",
                "User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0"]
        buf = BytesIO()
        curl.setopt(pycurl.WRITEFUNCTION, buf.write)
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.HTTPHEADER, head)
        curl.perform()
        the_page = buf.getvalue()
        buf.close()
        return the_page


    # ------------------------------- POST function ------------------------------- #
    def PostData(curl, url, data):
        """Submit data to url using the HTTP POST method.

        Note: the data submitted here is described as JSON. To send a different
        type, change the type declared in head.
        """
        head = ["Accept:*/*",
                "Content-Type:application/xml",  # must match the payload; change as needed
                "render:json",
                "clientType:json",
                "Accept-Charset:GBK,utf-8;q=0.7,*;q=0.3",
                "Accept-Encoding:gzip,deflate,sdch",
                "Accept-Language:zh-CN,zh;q=0.8",
                "User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0"]
        buf = BytesIO()
        curl.setopt(pycurl.WRITEFUNCTION, buf.write)
        curl.setopt(pycurl.POSTFIELDS, data)
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.HTTPHEADER, head)
        curl.perform()
        the_page = buf.getvalue()
        # print(the_page)
        buf.close()
        return the_page


    # ---------------------------------- usage ------------------------------------ #
    c = initCurl()
    html = GetData(c, "http://www.baidu.com")
    print(html.decode("utf-8", "replace"))
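The PostData function expects its `data` argument to be already encoded, and the Content-Type header has to match that encoding. As a minimal sketch, here are two common ways to build such a body with the standard library. The field names and values are hypothetical, and pycurl itself is not needed to run it.

```python
import json
from urllib.parse import urlencode

# Hypothetical form fields, for illustration only.
fields = {"username": "alice", "note": "a b&c"}

# Form-encoded body; pairs with "Content-Type: application/x-www-form-urlencoded".
form_body = urlencode(fields)

# JSON body as UTF-8 bytes; pairs with "Content-Type: application/json".
json_body = json.dumps(fields, ensure_ascii=False).encode("utf-8")

print(form_body)
print(json_body)
```

Either value could then be passed as `data`, provided the Content-Type entry in `head` is changed to match the chosen format.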
Example 3: resolving a short link to its actual URL
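    from io import BytesIO

    import pycurl

    c = pycurl.Curl()
    buf = BytesIO()  # the body is collected here but not used
    c.setopt(pycurl.URL, "http://t.cn/Rhevig4")
    c.setopt(pycurl.WRITEFUNCTION, buf.write)
    c.setopt(pycurl.FOLLOWLOCATION, 1)  # follow redirects to the final address
    c.perform()
    # EFFECTIVE_URL is the last URL used, i.e. the address after all redirects
    print(c.getinfo(pycurl.EFFECTIVE_URL))
    c.close()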