Working with the Web in C#

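As we know, the .NET class library provides classes such as HttpWebRequest that make it easy to talk to web servers from code. In practice, though, we often run into needs that the base classes do not directly cover:
  • Converting the text encoding of the HTML obtained from an HttpWebResponse so that it does not come out garbled;
  • Automatically keeping Cookies, Referer, and related information across the requests of a session;
  • Simulating HTML form submission;
  • Uploading files to the server;
  • For binary resources, getting the returned byte array (byte[]) directly, or saving it to a file.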


To solve these problems I wrote the HttpClient class. Here is how to use it:


Getting a string with the encoding converted

 
HttpClient client=new HttpClient(url);
string html=client.GetString();

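GetString() looks in the HTTP headers and in the HTML meta tags to try to determine the encoding of the content. If it finds nothing in either place, it falls back to client.DefaultEncoding, which defaults to utf-8 and can be set manually.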
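The header half of that lookup can be sketched on its own. This is only an illustration under assumptions: the names CharsetSketch, CharsetFromContentType, and Decode are hypothetical and are not part of the HttpClient class, and the parsing here is slightly stricter than the listing further down (it splits on ';' and ignores case).

```csharp
using System;
using System.Text;

static class CharsetSketch
{
    // Returns the charset named in a Content-Type value, or null if none is present.
    public static string CharsetFromContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;
        foreach (string part in contentType.Split(';'))
        {
            string p = part.Trim();
            if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
                return p.Substring("charset=".Length).Trim('"', '\'', ' ');
        }
        return null;
    }

    // Decodes the bytes with the header charset, falling back to a default
    // when the header names no charset or an unknown one.
    public static string Decode(byte[] data, string contentType, string defaultEncoding)
    {
        string name = CharsetFromContentType(contentType) ?? defaultEncoding;
        Encoding encoding;
        try { encoding = Encoding.GetEncoding(name); }
        catch (ArgumentException) { encoding = Encoding.GetEncoding(defaultEncoding); }
        return encoding.GetString(data);
    }

    static void Main()
    {
        // "café" encoded as ISO-8859-1: the last byte is not valid UTF-8 on its own.
        byte[] body = new byte[] { 0x63, 0x61, 0x66, 0xE9 };
        Console.WriteLine(Decode(body, "text/html; charset=iso-8859-1", "utf-8"));
    }
}
```

Falling back silently on an unknown charset name mirrors what GetString() does; whether that is preferable to surfacing the error depends on the caller.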


Keeping Cookies and Referer automatically

 
HttpClient client=new HttpClient(url1, null, true);
string html1=client.GetString();
client.Url=url2;
string html2=client.GetString();

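When the third constructor argument, keepContext, is true, HttpClient automatically records the Cookies the server sets in each response and uses the Url of the previous request as the Referer. In this example, the request for html2 sends url1 as the Referer, along with the Cookies the server set while html1 was fetched. You can also supply the Cookies and Referer for the first request directly when constructing the HttpClient: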
 
HttpClient client=new HttpClient(url, new WebContext(cookies, referer), true);

Or change them at any time while using the client:
 
client.Context.Cookies=cookies;
client.Context.Referer=referer;
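To see what "keeping cookies" amounts to at the HttpWebRequest level, here is a small offline sketch using System.Net.CookieContainer, the same type the class assigns to each request. The site URL, the cookie name and value, and the CookieSketch class are made up for illustration; no request is actually sent.

```csharp
using System;
using System.Net;

static class CookieSketch
{
    // Returns the Cookie header that a request to nextUrl would carry
    // after the given cookie has been stored for site.
    public static string NextCookieHeader(Uri site, Cookie cookie, Uri nextUrl)
    {
        CookieContainer jar = new CookieContainer();
        jar.Add(site, cookie);
        return jar.GetCookieHeader(nextUrl);
    }

    static void Main()
    {
        Uri site = new Uri("http://example.com/");       // hypothetical site
        Uri next = new Uri("http://example.com/page2");  // hypothetical second request

        // Pretend the first response carried "Set-Cookie: session=abc123; path=/".
        Cookie cookie = new Cookie("session", "abc123", "/");

        Console.WriteLine(NextCookieHeader(site, cookie, next));
    }
}
```

With keepContext enabled, HttpClient does the equivalent bookkeeping for you by copying the response cookies into Context.Cookies and adding them to the next request's container.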



Simulating HTML form submission

 
HttpClient client=new HttpClient(url);
client.PostingData.Add(fieldName1, fieldValue1);
client.PostingData.Add(fieldName2, fieldValue2);
string html=client.GetString();

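The code above is equivalent to submitting a form with two inputs. When PostingData is non-empty, or when files have been attached for upload (see the file upload section below), HttpClient automatically switches the HttpVerb to POST and attaches the corresponding data to the request.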
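When no file is attached, the fields go out as an application/x-www-form-urlencoded body. A minimal sketch of how such a body can be built is shown here; FormBodySketch and BuildFormBody are hypothetical names, and Uri.EscapeDataString is used for the escaping as an assumption of this sketch (the class itself relies on HttpUtility.UrlEncode).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class FormBodySketch
{
    // Joins name=value pairs with '&', escaping each name and value separately
    // so that the '=' and '&' separators themselves are never escaped.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        StringBuilder sb = new StringBuilder();
        foreach (KeyValuePair<string, string> field in fields)
        {
            if (sb.Length > 0) sb.Append('&');
            sb.Append(Uri.EscapeDataString(field.Key));
            sb.Append('=');
            sb.Append(Uri.EscapeDataString(field.Value));
        }
        return sb.ToString();
    }

    static void Main()
    {
        List<KeyValuePair<string, string>> fields = new List<KeyValuePair<string, string>>();
        fields.Add(new KeyValuePair<string, string>("q", "a&b"));
        fields.Add(new KeyValuePair<string, string>("lang", "C#"));

        Console.WriteLine(BuildFormBody(fields)); // prints q=a%26b&lang=C%23
    }
}
```

Uri.EscapeDataString writes a space as %20 rather than '+'. Servers generally accept both in form bodies, but this is a choice made for the sketch, not a description of what HttpClient sends.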


Uploading files to the server

 
HttpClient client=new HttpClient(url);
client.AttachFile(fileName, fieldName);
client.AttachFile(byteArray, fileName, fieldName);
string html=client.GetString();

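Here fieldName corresponds to the fieldName in <input type="file" name="fieldName" />, and fileName is the path of the file you want to upload. You can also supply a byte[] directly as the file content, but even then you must provide a file name to satisfy the HTTP specification.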
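For reference, the body of a single-file multipart/form-data upload has roughly the shape produced by the sketch below. MultipartSketch, BuildBody, and the boundary, field, and file names are all hypothetical; the real class generates its boundary from a Guid and can also include the PostingData fields as extra parts.

```csharp
using System;
using System.IO;
using System.Text;

static class MultipartSketch
{
    // Builds a multipart/form-data body containing exactly one file part.
    public static byte[] BuildBody(string boundary, string fieldName, string fileName, byte[] content)
    {
        using (MemoryStream body = new MemoryStream())
        {
            // Part header: boundary line, disposition, content type, blank line.
            string header =
                "--" + boundary + "\r\n" +
                "Content-Disposition: form-data; name=\"" + fieldName +
                "\"; filename=\"" + fileName + "\"\r\n" +
                "Content-Type: application/octet-stream\r\n\r\n";
            byte[] headerBytes = Encoding.UTF8.GetBytes(header);
            body.Write(headerBytes, 0, headerBytes.Length);

            // Raw file bytes, written without any text encoding.
            body.Write(content, 0, content.Length);

            // Closing boundary: the boundary followed by two extra dashes.
            byte[] footer = Encoding.UTF8.GetBytes("\r\n--" + boundary + "--\r\n");
            body.Write(footer, 0, footer.Length);
            return body.ToArray();
        }
    }

    static void Main()
    {
        byte[] body = BuildBody("XyZ123", "upload", "hello.txt", Encoding.ASCII.GetBytes("hello"));
        Console.Write(Encoding.ASCII.GetString(body));
    }
}
```

The request's Content-Type header must carry the same boundary string, for example multipart/form-data; boundary=XyZ123.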


Different forms of the result


 
// As a string:
string html = client.GetString();
// As a stream:
Stream stream = client.GetStream();
// As a byte array:
byte[] data = client.GetBytes();
// Saved to a file:
client.SaveAsFile(fileName);
// Or work with the HttpWebResponse directly:
HttpWebResponse res = client.GetResponse();

Every call to any of the methods above sends a new HTTP request, so you cannot get two different forms of the same response.
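One way around this, if you need both the text and a file from a single response, is to call GetBytes() once and derive everything else from the array. The sketch below assumes the encoding is already known; a hard-coded byte array stands in for client.GetBytes(), and ReuseBytesSketch, DecodeAndSave, and the temp-file name are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

static class ReuseBytesSketch
{
    // Writes the bytes to a file and returns the same bytes decoded as text.
    public static string DecodeAndSave(byte[] data, Encoding encoding, string path)
    {
        File.WriteAllBytes(path, data);   // creates or overwrites the file
        return encoding.GetString(data);
    }

    static void Main()
    {
        // Stand-in for: byte[] data = client.GetBytes();
        byte[] data = Encoding.UTF8.GetBytes("<html>ok</html>");

        string path = Path.Combine(Path.GetTempPath(), "response-sketch.html");
        string text = DecodeAndSave(data, Encoding.UTF8, path);

        Console.WriteLine(text.Length + " characters, " + new FileInfo(path).Length + " bytes on disk");
    }
}
```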

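After calling any of them, you can read the HTTP headers returned by the server through client.ResponseHeaders.

A few more features are planned, such as resumable downloads, multithreaded downloads, and an event mechanism for download progress updates. I am still thinking about how to fit them into the existing code, and I look forward to your feedback.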

You can download the full code of the current version here:
http://files.cnblogs.com/deerchao/httpclient.zip
 
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Net;
using System.Web;

namespace Deerchao.Utility
{
    public class HttpClient
    {
        #region fields
        private bool keepContext;
        private string defaultLanguage = "zh-CN";
        private string defaultEncoding = "utf-8";
        private string accept = "*/*";
        private string userAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)";
        private HttpVerb verb;
        private WebContext context;
        private readonly List<HttpUploadingFile> files = new List<HttpUploadingFile>();
        private readonly Dictionary<string, string> postingData = new Dictionary<string, string>();
        private string url;
        private WebHeaderCollection responseHeaders;
        #endregion

        #region property
        public bool KeepContext
        {
            get { return keepContext; }
            set { keepContext = value; }
        }

        public string DefaultLanguage
        {
            get { return defaultLanguage; }
            set { defaultLanguage = value; }
        }

        public string DefaultEncoding
        {
            get { return defaultEncoding; }
            set { defaultEncoding = value; }
        }

        public HttpVerb Verb
        {
            get { return verb; }
            set { verb = value; }
        }

        public List<HttpUploadingFile> Files
        {
            get { return files; }
        }

        public Dictionary<string, string> PostingData
        {
            get { return postingData; }
        }

        public string Url
        {
            get { return url; }
            set { url = value; }
        }

        public WebHeaderCollection ResponseHeaders
        {
            get { return responseHeaders; }
        }

        public string Accept
        {
            get { return accept; }
            set { accept = value; }
        }

        public string UserAgent
        {
            get { return userAgent; }
            set { userAgent = value; }
        }

        public WebContext Context
        {
            get { return context; }
            set { context = value; }
        }
        #endregion

        #region constructor
        public HttpClient()
        {
        }

        public HttpClient(string url)
            : this(url, null)
        {
        }

        public HttpClient(string url, WebContext context)
            : this(url, context, false)
        {
        }

        public HttpClient(string url, WebContext context, bool keepContext)
        {
            this.url = url;
            this.context = context;
            this.keepContext = keepContext;
        }
        #endregion

        public void AttachFile(string fileName, string fieldName)
        {
            HttpUploadingFile file = new HttpUploadingFile(fileName, fieldName);
            files.Add(file);
        }

        public void AttachFile(byte[] data, string fileName, string fieldName)
        {
            HttpUploadingFile file = new HttpUploadingFile(data, fileName, fieldName);
            files.Add(file);
        }

        public void ClearLastRequestInfo()
        {
            verb = HttpVerb.GET;
            files.Clear();
            postingData.Clear();
            responseHeaders = null;
            url = null;
        }

        private HttpWebRequest CreateRequest()
        {
            HttpWebRequest req = (HttpWebRequest) WebRequest.Create(url);
            req.AllowAutoRedirect = false;
            req.CookieContainer = new CookieContainer();
            req.Headers.Add("Accept-Language", defaultLanguage);
            req.Accept = accept;
            req.UserAgent = userAgent;
            req.KeepAlive = false;

            if (context != null)
            {
                if (context.Cookies != null)
                    req.CookieContainer.Add(context.Cookies);
                if (!string.IsNullOrEmpty(context.Referer))
                    req.Referer = context.Referer;
            }

            if (postingData.Count > 0 || files.Count > 0)
                verb = HttpVerb.POST;

            if (verb == HttpVerb.POST)
            {
                req.Method = "POST";

                MemoryStream memoryStream = new MemoryStream();
                StreamWriter writer = new StreamWriter(memoryStream);

                if (files.Count > 0)
                {
                    string newLine = "\r\n";
                    string boundary = Guid.NewGuid().ToString().Replace("-", "");
                    req.ContentType = "multipart/form-data; boundary=" + boundary;

                    foreach (string key in postingData.Keys)
                    {
                        writer.Write("--" + boundary + newLine);
                        writer.Write("Content-Disposition: form-data; name=\"{0}\"{1}{1}", key, newLine);
                        writer.Write(postingData[key] + newLine);
                    }

                    foreach (HttpUploadingFile file in files)
                    {
                        writer.Write("--" + boundary + newLine);
                        writer.Write(
                            "Content-Disposition: form-data; name=\"{0}\"; filename=\"{1}\"{2}",
                            file.FieldName,
                            file.FileName,
                            newLine
                            );
                        writer.Write("Content-Type: application/octet-stream" + newLine + newLine);
                        writer.Flush();
                        memoryStream.Write(file.Data, 0, file.Data.Length);
                        writer.Write(newLine);
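                    }

                    // Terminate the multipart body with the closing boundary.
                    writer.Write("--" + boundary + "--" + newLine);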
                }
                else
                {
                    req.ContentType = "application/x-www-form-urlencoded";
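                    StringBuilder sb = new StringBuilder();
                    foreach (string key in postingData.Keys)
                    {
                        // Encode each name and value separately so that the
                        // '=' and '&' separators are sent unescaped.
                        sb.AppendFormat("{0}={1}&",
                            HttpUtility.UrlEncode(key),
                            HttpUtility.UrlEncode(postingData[key]));
                    }
                    if (sb.Length > 0)
                        sb.Length--;
                    writer.Write(sb.ToString());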
                }

                writer.Flush();

                using (Stream stream = req.GetRequestStream())
                {
                    memoryStream.WriteTo(stream);
                }
            }
            return req;
        }


        public HttpWebResponse GetResponse()
        {
            HttpWebRequest req = CreateRequest();
            HttpWebResponse res = (HttpWebResponse) req.GetResponse();
            responseHeaders = res.Headers;
            if (keepContext)
            {
                if (context == null)
                    context = new WebContext();
                context.Cookies = res.Cookies;
                context.Referer = url;
            }
            return res;
        }

        public Stream GetStream()
        {
            return GetResponse().GetResponseStream();
        }

        public byte[] GetBytes()
        {
            HttpWebResponse res = GetResponse();

            MemoryStream memoryStream = new MemoryStream();
            byte[] buffer = new byte[0x400];
            Stream rs = res.GetResponseStream();
            for (int i = rs.Read(buffer, 0, buffer.Length);
                 i > 0;
                 i = rs.Read(buffer, 0, buffer.Length))
            {
                memoryStream.Write(buffer, 0, i);
            }
            rs.Close();

            return memoryStream.ToArray();
        }

        public string GetString()
        {
            byte[] data = GetBytes();
            string encoding = GetEncodingFromHeaders();

            if (encoding == null)
                encoding = GetEncodingFromBody(data);

            if (encoding == null)
                encoding = defaultEncoding;

            Encoding actualEncoding;
            try
            {
                actualEncoding = Encoding.GetEncoding(encoding);
            }
            catch
            {
                actualEncoding = Encoding.GetEncoding(defaultEncoding);
            }
            return actualEncoding.GetString(data);
        }

        private string GetEncodingFromHeaders()
        {
            string encoding = null;
            string contentType = responseHeaders["content-type"];
            if (contentType != null)
            {
                int i = contentType.IndexOf("charset=");
                if (i != -1)
                {
                    encoding = contentType.Substring(i + 8);
                }
            }
            return encoding;
        }

        private string GetEncodingFromBody(byte[] data)
        {
            string encoding = null;
            string dataAsAscii = Encoding.ASCII.GetString(data);
            if (dataAsAscii != null)
            {
                int i = dataAsAscii.IndexOf("charset=");
                if (i != -1)
                {
                    int j = dataAsAscii.IndexOf("\"", i);
                    if (j != -1)
                    {
                        int k = i + 8;
                        encoding = dataAsAscii.Substring(k, (j - k) + 1);
                        char[] chArray = new char[2] { '>', '"' };
                        encoding = encoding.TrimEnd(chArray);
                    }
                }
            }
            return encoding;
        }

        public void SaveAsFile(string fileName)
        {
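            // FileMode.Create truncates an existing file, so no stale bytes
            // are left behind after the new content.
            using (BinaryWriter writer = new BinaryWriter(
                new FileStream(fileName, FileMode.Create, FileAccess.Write)))
                writer.Write(GetBytes());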
        }
    }

    public class WebContext
    {
        private CookieCollection cookies;
        private string referer;

        public CookieCollection Cookies
        {
            get { return cookies; }
            set { cookies = value; }
        }

        public string Referer
        {
            get { return referer; }
            set { referer = value; }
        }
    }

    public enum HttpVerb
    {
        GET,
        POST,
    }

    public class HttpUploadingFile
    {
        private string fileName;
        private string fieldName;
        private byte[] data;

        public string FileName
        {
            get { return fileName; }
            set { fileName = value; }
        }

        public string FieldName
        {
            get { return fieldName; }
            set { fieldName = value; }
        }

        public byte[] Data
        {
            get { return data; }
            set { data = value; }
        }

        public HttpUploadingFile(string fileName, string fieldName)
        {
            this.fileName = fileName;
            this.fieldName = fieldName;
            // File.ReadAllBytes reads the whole file; a single Stream.Read
            // call is not guaranteed to fill the buffer.
            data = File.ReadAllBytes(fileName);
        }

        public HttpUploadingFile(byte[] data, string fileName, string fieldName)
        {
            this.data = data;
            this.fileName = fileName;
            this.fieldName = fieldName;
        }
    }
}
