Wednesday, September 20, 2017

Microsoft Flow: Updating File Metadata When You Have More Than 5000 Items

It’s been way too long since I did a blog post. Let’s remedy that, shall we?

Paul Culmsee (@paulculmsee) has been doing some great work with Microsoft Flow and PowerApps. His daughter Ashlee even won a contest put on by the PowerApps team to make a fidget spinner app! When Paul digs into a technology, stand back, because he beats it into submission like few others. If there is a Clever Workaround – and we need those way too often – he’ll find it.

Paul did a post back in August I’ve referred to multiple times as I’ve tried to use Microsoft Flow. My approach with Flow here is that I try to use it for anything new I need to automate in Office 365. If I hit a roadblock, then I revert to SharePoint Designer workflows. The good news is that I’m finding I revert less and less, though it still happens maybe a third of the time at this point. (I’m guessing, but that’s about what it feels like. I would love it to be 0%!) Unfortunately, that means rework down the road if we want to be fully “modern”.

Paul’s article The (currently) best way to update SharePoint document metadata using Flow shows us the unfortunately byzantine way we need to do things in order to update metadata on a file we’ve just created. The fact that we don’t get a useful “handle” back to the file we just created is sorta crazy, and the Microsoft Flow team knows this. In the meantime, Paul’s approach works. Well, it works unless you have more than 5000 items in a library.

I had exactly that situation. At one of my clients, we have an automated process which dumps files into a library I’ve inventively called __Temp. When a new file shows up, I run a Flow to copy the file into a place where we want to store it and apply metadata to the copied file. My problem arose when I tried Paul’s second step. Because my library has way over 5000 items, that step failed.

So, until we are able to get the ID or some other useful handle for the newly created file, I needed to come up with a cleaver workaround on top of Paul’s clever workaround.

The trick was to do the following:

  • Add an index to the Modified column in the destination list.
  • Set up a TimeStamp variable in my Flow to capture the startOfDay() when the Flow is running
  • In the GetItems step, use a filter based on the Modified column, and *then* the FileLeafRef to find the file I just created

Here it is in pictures…

Make sure you have an index on the Modified column in the target list. In my case, I have indices on several other columns as well.

In the flow, set up a variable to hold

startOfDay(utcnow())
 . You set this up in the Expression Builder. I initially tried to set the variable just before I created the new file, setting it to
utcnow()
 . In theory, this should have worked, but I found I had about a 50% failure rate. This doesn’t make a lot of sense, but to be safe, I switched to
startOfDay()
 , which has proved reliable.

Just like Paul’s example, I then create the new file.

In the GetItems step, I use the more complex filter: 

Modified gt datetime'TimeStamp​​' and FileLeafRef eq 'Name'
Because Modified is an indexed column, and it is the first filter in my expression, the result set is reduced to fewer than 5000 items and the query works.

Now everything works reliably. I can’t wait to reduce the complexity of this Flow when we get a handle to the newly created file in the Create File step!

Aren’t you proud of me? I wrote this whole post without whining about the 5000 item limit.


by Marc D Anderson via Marc D Anderson's Blog

No comments:

Post a Comment