Notice: FilmLA has updated and reorganized its website. In the process, many of our links to their site were broken. Most have been repaired by referencing archived copies. Please let us know if you encounter a broken link.

June 23, 2018

The Data Chase (part 3): "The scrape"

——Continued from The Data Chase (part 2)——

The good folks at Jassy, Vick and Carolan (JVC) could no longer support us. In our last communication, JVC speculated that a backup of the database for the Online Permit System (OPS) could be a public record. There was little doubt the backup existed and that it would hold the data in an electronic form we could use.

We drafted another CPRA request, but were reluctant to proceed without a review from an expert.  Once again we were shopping for pro bono help.

We revisited the First Amendment Coalition (FAC) website for references to other firms with CPRA expertise. We sent an inquiry to every firm they listed in the LA area. Our only reply was a polite reminder from an attorney who had already turned us down. We had hit a dead end.

Then our luck changed.

I was having beers at Der Wolfskopf in Pasadena with an old friend who had given up engineering for law and community activism. I was rattling on about the state of our data collection project. He listened carefully and made a referral. He had seen an attorney make a dazzling, open-records appeal to the Orange County Board of Supervisors. Her name was Kelly Aviles.

We sent Ms. Aviles an inquiry. Her response was immediate; she was willing to hear more. We arranged a call. She was inspiring. Not only did she have confidence in the legitimacy of our request, she was quick to grasp the essentials, convincing in reply, and firm in her analysis. She agreed to review our letter.

On February 15, we sent our third CPRA request to FilmL.A. This one was strongly worded. We asked for the "electronic data in the format in which it is held, in this case, the raw data held in the Online Permit System database."

We received the following reply dated February 28th:
We have received and reviewed your latest CPRA request with legal counsel and reiterate that, other than where contractually obligated to treat final permits as public records, Film L.A., Inc. is a private organization and not subject to the California Public Records Act. Our technology systems (Online Permit System) is proprietary and recognized as such in our municipal contract. Data within the system to fulfill our obligation to generate the final permit (all of which would occur during a "draft" stage or stages) may or may not become part of the final released or rejected permit. This data is not, at any rate, the final permit which would be recognized as the public record in our possession and for which we are required to respond to CPRA requests. Our municipal clients have only limited access to OPS through a communications link and not the database or its data. In addition OPS includes information that is internal to the organization and not our municipal clients. The public records we hold (final permits) are available in electronic format and access to those records has been provided to you.

Excerpt from letter signed by Paul Audley, President of FilmL.A.

Audley's offer of access to the OPS system was as unappealing as ever, but this reply, more than the others, seemed to be laying legal groundwork. The CPRA stipulates that records with proprietary information are not public records. It also says that records created during a contract development period are not public records until the contract is awarded. Audley seemed to be suggesting that a draft permit was not a public record because it was a contract in development. In other words, if the database held proprietary information or if permits were like contracts, then the OPS database might not be a public record and FilmL.A. might have a basis for a denial.

Were his assertions true?  Were they grounds for denial?

The argument about draft permits as contracts seemed easily overcome. We need only request a backup dated a few months back, so that any draft permits would no longer be "in development."

Audley's claim that FilmL.A. was not subject to the CPRA, because it was private and because the database held proprietary data, might be dubious. In an early email, a FAC attorney had advised us that if "...FilmLA was created by the city and county to perform the government function of approving filming permits...then FilmLA should be subject to the CPRA and would have to comply with a records request..." In fact, FilmL.A. was created by LA City and LA County for the purpose of replacing services previously provided by City and County departments. Moreover, nearly all of its operating funds are derived from film permit fees, film field services and school film licenses, all revenues previously charged and collected by local government.

We consulted with Aviles. She was dismissive of Audley's claims and confident that, in the end, our request would succeed. But, she drew the same conclusion that JVC had drawn: FilmL.A. would not provide the records without a writ from the Court. There was no point in continuing to send our informal CPRA letters. To proceed, we would need to raise funds for legal fees and court costs.

We requested feedback from a group of interested friends and colleagues who had been following our progress. Did they think we should start a GoFundMe campaign to raise the needed funds? The response was very positive. So positive, in fact, that we had secured enough pledges to pay for the first legal step before we even opened a GoFundMe account.

That was a tipping point.

Before committing to an expensive legal process, we owed it to ourselves to take a closer look at the approach Audley had repeatedly offered. Perhaps we could get the data from the OPS system. Perhaps it was realistic to capture, or scrape, all the documents from the OPS system and extract the data from them.

I contacted a few old software development colleagues for some pointers. The download could be automated. The software we needed was free and in widespread use by web developers. The download process could be run on no-cost services available from Amazon and Google. The decoding would require some programming. The programming tools were free.

In February I did some investigation and ran some preliminary tests. FilmL.A. permits have a regular structure, so there was a decent chance decoding would work. I timed the download process; the OPS system was slow and was restarted every night. We could download from four to a dozen documents a minute. There were about 180,000 documents on the site. The scrape was possible, but would take a month. Programming the decoder would take a couple of weeks. Fixing bugs and correcting for anomalies would take a few more weeks. The project to scrape the data from the OPS system seemed feasible.
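The month-long figure follows from simple arithmetic on the numbers above. Here is a rough sketch in Python, assuming the download runs around the clock. The nightly restart and any retries are ignored, so a real run would take longer than the result suggests.

```python
# Back-of-the-envelope scrape duration from the figures quoted above.
# Assumption: continuous downloading, 24 hours a day, no retries.
TOTAL_DOCUMENTS = 180_000
MINUTES_PER_DAY = 24 * 60


def days_to_scrape(docs_per_minute: float, total: int = TOTAL_DOCUMENTS) -> float:
    """Return the number of days needed to download `total` documents."""
    return total / docs_per_minute / MINUTES_PER_DAY


slow = days_to_scrape(4)    # slowest observed rate: four documents a minute
fast = days_to_scrape(12)   # fastest observed rate: a dozen documents a minute
print(f"Between {fast:.1f} and {slow:.1f} days of continuous downloading")
```

At the slow end of the range this works out to roughly a month, which matches the estimate above.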

I discussed the project with Steve. He agreed that we should try the scrape before starting the legal process.

I began the Scrape Project during the last week of February. My programming estimate was off by half, but we had a credible data set in early May. After all the hubbub, we had what we had requested: the data we needed to estimate the untaxed income from location-filming rentals in the Los Angeles region.

Here are a few details about the data we collected in the Scrape Project:
  • Jurisdictions covered: City of LA, LA County, plus 15 municipalities in the region that have contracted for FilmL.A.'s services
  • Jurisdictions not covered: Burbank, Glendale, Pasadena, plus numerous other municipalities in the region
  • Years covered in the data: Q2 2008 through Q1 2018
  • Total number of permits downloaded and decoded: 180,435
  • Total number of times locations were approved for filming: 335,713
  • Total number of permitted on-location production days, including prep, filming and strike: 913,295¹
  • Each data record includes the following values:
    • Permit number
    • Production company
    • Production title
    • Location manager
    • Jurisdiction
    • Prep dates
    • Film dates
    • Strike dates
    • Total production days (sum of prep, film and strike days)
    • Year
    • Permit type (commercial, feature, still, etc.)
    • Location address
    • Zip code
    • Location type (e.g. private property, park, parking lot, etc.)
In the coming weeks, we plan to make the data set publicly available. Similarly, we will make the permit decoder source code available.
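
Until the decoder source is published, here is one way a record with the fields listed above might be modeled. This is only an illustrative sketch. The field names, the types, and the choice to store each phase as a list of individual dates are assumptions, not the layout the decoder actually uses.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class PermitRecord:
    """One decoded record. Names and types are illustrative assumptions."""
    permit_number: str
    production_company: str
    production_title: str
    location_manager: str
    jurisdiction: str
    year: int
    permit_type: str       # e.g. "commercial", "feature", "still"
    location_address: str
    zip_code: str          # kept as text so leading zeros survive
    location_type: str     # e.g. "private property", "park", "parking lot"
    prep_dates: List[date] = field(default_factory=list)
    film_dates: List[date] = field(default_factory=list)
    strike_dates: List[date] = field(default_factory=list)

    @property
    def total_production_days(self) -> int:
        """Sum of prep, film and strike days."""
        return len(self.prep_dates) + len(self.film_dates) + len(self.strike_dates)


# Every value below is a made-up placeholder, not real permit data.
example = PermitRecord(
    permit_number="PLACEHOLDER-0001",
    production_company="Example Productions",
    production_title="Example Title",
    location_manager="Jane Doe",
    jurisdiction="LA County",
    year=2015,
    permit_type="feature",
    location_address="123 Example St",
    zip_code="00000",
    location_type="private property",
    prep_dates=[date(2015, 3, 1)],
    film_dates=[date(2015, 3, 2), date(2015, 3, 3)],
    strike_dates=[date(2015, 3, 4)],
)
print(example.total_production_days)
```

Deriving the total from the three date lists, rather than storing it separately, keeps the count consistent with the dates it summarizes.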

From the start, our effort to obtain and analyze filming permit data was motivated by a desire to understand the amount of revenue that passed through an exception in the IRS and FTB tax codes. It was our thought that, if the amount was sufficiently large, a local tax might generate funds that would provide significant benefit to underfunded government programs that assist the disadvantaged.

An initial estimate, based on filming in Altadena, suggested that the untaxed revenues in the LA region might be as much as $200 million per year. If that estimate was accurate, a tax rate comparable to the Transient Occupancy Tax might generate as much as $20 million for underfunded local programs.

We used the scraped data to develop a more credible estimate. Based on that data, it appears the estimate of $200 million per year was much too high. A more reasonable estimate for tax-free film-rental income in the LA region appears to be between $50 million and $60 million per year. Assuming a transient occupancy tax rate, the estimated tax proceeds would be $5 million to $6 million per year, well short of the estimated $20 million per year that originally motivated this study.²
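
The arithmetic behind these figures is straightforward. The sketch below assumes a flat 10% rate, which is simply the rate implied by the numbers in this post ($20 million on $200 million). Actual occupancy tax rates vary by jurisdiction, so treat the rate as a placeholder.

```python
# Assumed flat rate, inferred from this post's own figures ($20M on $200M).
ASSUMED_TAX_RATE = 0.10


def estimated_proceeds(untaxed_income: float, rate: float = ASSUMED_TAX_RATE) -> float:
    """Annual tax proceeds, in the same units as `untaxed_income`."""
    return untaxed_income * rate


# All figures are in millions of dollars per year.
initial = estimated_proceeds(200)       # original Altadena-based estimate
revised_low = estimated_proceeds(50)    # low end of the scraped-data estimate
revised_high = estimated_proceeds(60)   # high end of the scraped-data estimate

print(f"Initial estimate: ${initial:.0f} million")
print(f"Revised estimate: ${revised_low:.0f} to ${revised_high:.0f} million")
```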

In retrospect, we should have started the Scrape Project months before we did. We did not need to occupy the energies of those public-spirited attorneys or the people of FilmL.A., although we would have hoped that a public agency would be helpful. However, we now have the data and it speaks for itself.

1 The number was not scrubbed for errors in the original permit.  For example, in a dozen or so cases, the original permit may show a start date in 2015 and end date in 2016 adding 300 days to the total.  Consequently, the number of production days may be inflated by a few thousand or less than 0.10%.
2 An informal report that explains the basis of this estimate is available on request.
