Thursday, August 20, 2015

Is it that difficult to read a file from SharePoint site?


It looks like a silly question, but it took me two hours to solve.
I faced an issue while downloading a file, though not from the browser.
My requirement was to read data from an Excel file stored in a document library. I was using the OpenXML libraries to parse the Excel data, and they require either the path of the file or a stream as a parameter. Since it is not possible to provide a file-system path for a document stored in a library, I had to pass a stream as the argument.
This is exactly where I ran into trouble: reading the file as a stream.
The name of the issue is "NetworkStream", one of the least discussed classes.
A Brief History of "NetworkStream":
It inherits from 'System.IO.Stream', so it exposes the members defined there.
However, it rejects every seek operation we ask for. In fact, this was the first time I had come across this class.
        
// The statement below returns a NetworkStream object.
// fileInfoObject is of type Microsoft.SharePoint.Client.FileInformation
var stream = fileInfoObject.Stream;

According to the documentation, it should return a standard stream. In practice it returns a stream object that does not support seek-related members (Length, Seek, etc.).
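To illustrate what "does not support seek operations" means, here is a minimal sketch that runs without SharePoint. ForwardOnlyStream and StreamBuffering are hypothetical names introduced only for this example; ForwardOnlyStream is a stand-in that behaves like a forward-only stream such as NetworkStream. The sketch shows the general technique of copying a non-seekable stream into a MemoryStream. It does not verify whether this helps with the stream returned by FileInformation.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a forward-only stream (similar in spirit to NetworkStream).
public sealed class ForwardOnlyStream : Stream
{
    private readonly Stream _inner;

    public ForwardOnlyStream(Stream inner) { _inner = inner; }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }

    // Seek-related members throw, just as they do on a non-seekable stream.
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return _inner.Read(buffer, offset, count);
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) { _inner.Dispose(); }
        base.Dispose(disposing);
    }
}

public static class StreamBuffering
{
    // Returns the source unchanged if it is seekable; otherwise reads it to the
    // end into a MemoryStream and returns that, rewound to the start.
    public static Stream ToSeekable(Stream source)
    {
        if (source.CanSeek) { return source; }
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0;
        return buffer;
    }
}
```

Usage would look like `Stream seekable = StreamBuffering.ToSeekable(someStream);`, after which Length and Seek are available on the returned stream.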

In fact, I tried a couple of methods I found on the internet, and none of them helped.
As a result of those experiments, I was left with a corrupted stream saved to disk as a file that scolded me very badly, in machine language, when I tried to open it. Further searching then rewarded me with a way to deal with the file: the System.Net.WebClient class, which provides methods for sending data to and, in my case, receiving data from a resource identified by a URI.

So, as shown below, getting the file as a stream is easy when the site readily authenticates us with the identity used to run the code.
For authentication, we can assign either a NetworkCredential object or CredentialCache.DefaultCredentials to WebClient.Credentials.

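using (WebClient webClient = new System.Net.WebClient())
{
    // Authenticate with the identity used to run the code
    webClient.Credentials = CredentialCache.DefaultCredentials;
    // excelFileUrl - path/url of the Excel file that is uploaded to a document library
    // DownloadData returns a byte[]; wrapping it in a MemoryStream gives a seekable stream
    Stream streamObject = new MemoryStream(webClient.DownloadData(excelFileUrl));
    // now I can do anything with this stream, such as...
    // 1. Saving to disk;
    // 2. Passing as an argument to OpenXML library method to parse it.
}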

This would have been the end of the post if my struggle had ended here. However, my actual requirement was to read the data from an Office 365 SharePoint site. Here the problem is authentication. Since our request does not carry any authentication tokens or cookies, the server rejects it, saying we do not have access. I tried to use my brain here, but it hardly paid off. This is what I tried:


webClientObj.Credentials = sharePointOnlineCredentialsObject;

When I observed the request pattern using Fiddler, I found that no FedAuth cookie was attached to the request. Then Google helped me again. The solution is to make the WebClient class cookie aware, that is, to associate the WebClient with a cookie container that receives the authentication cookies when we pass valid credentials.
The method below shows how to make WebClient cookie aware. It takes a series of steps:
1. Adding a cookie container to the child class - it stores the authentication cookies
2. Overriding the GetWebRequest method - it attaches the cookie container to each web request so that the authentication cookies are sent with it
3. While overriding, adding the UserAgent string to the web request

Without a UserAgent string, the cookie does not get added to the container. By adding it, our web request mimics a request sent by a browser. Any valid UserAgent string will do; it need not be the one I've given below.

// Method [1]
// This class extends the capabilities of the WebClient class by adding
// a CookieContainer to it.
using System;
using System.Net;

class AuthenticatedWebClient : System.Net.WebClient
{
    public System.Net.CookieContainer WebClientCookieContainer { get; private set; }

    public AuthenticatedWebClient()
    {
        WebClientCookieContainer = new System.Net.CookieContainer();
    }

    protected override WebRequest GetWebRequest(Uri url)
    {
        var request = (HttpWebRequest)base.GetWebRequest(url);
        // Attach the shared cookie container so cookies persist across requests
        request.CookieContainer = WebClientCookieContainer;
        // Send a browser-like UserAgent string
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.0; rv:12.0) Gecko/20100101 Firefox/12.0";
        return request;
    }
}
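To check whether a login request actually placed an authentication cookie in the container, one could inspect the container after the call. The sketch below is an assumption-laden illustration rather than part of the original method: CookieDiagnostics and HasCookie are hypothetical names, and the site URL in the usage line is a placeholder. It relies only on CookieContainer.GetCookies, which returns the cookies that would be sent to a given URI.

```csharp
using System;
using System.Net;

public static class CookieDiagnostics
{
    // Returns true if the container holds a cookie with the given name
    // that applies to siteUri (for example "FedAuth").
    public static bool HasCookie(CookieContainer container, Uri siteUri, string cookieName)
    {
        foreach (Cookie cookie in container.GetCookies(siteUri))
        {
            if (string.Equals(cookie.Name, cookieName, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }
        }
        return false;
    }
}
```

With the class above, a call might look like `CookieDiagnostics.HasCookie(client.WebClientCookieContainer, new Uri("https://example.sharepoint.com/"), "FedAuth")`, where the host name is hypothetical.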

Now the downloading part. The snippet below shows how to pass credentials with the cookie-aware WebClient class. We post the credentials, in a NameValueCollection object, to the login URL of the Office 365 site.

using (AuthenticatedWebClient authenticatedWebClient = new AuthenticatedWebClient())
{
    System.IO.Stream filestream = null;
    // NameValueCollection is in the System.Collections.Specialized namespace
    var values = new NameValueCollection
    {
        { "username", "saratchandra@myorganization.com" },
        { "password", "mypassword" }
    };
    // Post the credentials to the login url; cookies from the response land in the container
    authenticatedWebClient.UploadValues("http://ift.tt/1sRvdjz", values);
    string url = "http://ift.tt/1EGJcQC";
    byte[] content = authenticatedWebClient.DownloadData(url);
    filestream = new System.IO.MemoryStream(content);
    // The below code is a custom helper method to parse the excel data
    OpenXmlExcelHelper.GetExcelData(filestream);
}

With the method explained above, everything seemed to be under control. However, the result was again an incomplete chunk of bytes, which was again of no use.
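One quick way to tell a usable download from a broken one, before handing the bytes to the OpenXML parser, is to look at the first few bytes. An .xlsx file is a ZIP package, and a ZIP file starts with the bytes 'P', 'K', 0x03, 0x04, whereas an error or login page returned in place of the file would start with HTML text. The sketch below is only a rough sanity check under that assumption; DownloadSanityCheck and LooksLikeZipPackage are hypothetical names, and the check does not apply to the legacy .xls format or prove that the file is complete.

```csharp
using System;

public static class DownloadSanityCheck
{
    // Returns true if the content begins with the ZIP local file header
    // signature (0x50 0x4B 0x03 0x04), which every .xlsx package starts with.
    public static bool LooksLikeZipPackage(byte[] content)
    {
        return content != null
            && content.Length >= 4
            && content[0] == 0x50
            && content[1] == 0x4B
            && content[2] == 0x03
            && content[3] == 0x04;
    }
}
```

In the snippet above, this could be called as `DownloadSanityCheck.LooksLikeZipPackage(content)` right after DownloadData returns.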

This time I really had to use my brain, and this time it worked. Below is the code snippet: a simple way of getting the authentication cookie onto a WebClient request without much struggle.
Here, the `credentials` object is of type SharePointOnlineCredentials.

// Method [2]
using (WebClient webClient = new System.Net.WebClient())
{
    // Ask SharePointOnlineCredentials for the authentication cookie and send it as a header
    webClient.Headers.Add("Cookie", credentials.GetAuthenticationCookie(new Uri("http://ift.tt/1JuEq08")));
    string url = "http://ift.tt/1EGJcQC";
    byte[] content = webClient.DownloadData(url);
    System.IO.Stream filestream = new System.IO.MemoryStream(content);
    // custom method to parse excel data
    OpenXmlExcelHelper.GetExcelData(filestream);
}
Original Post

by Saratchandra Peddinti via Everyone's Blog Posts - SharePoint Community
