BigW Consortium Gitlab

  1. 03 Aug, 2016 1 commit
    • Improve AutolinkFilter#text_parse performance · dd35c3dd
      Yorick Peterse authored
      By using clever XPath queries we can quite significantly improve the
      performance of this method. The actual improvement depends a bit on the
      amount of links used but in my tests the new implementation is usually
      around 8 times faster than the old one. This was measured using the
      following benchmark:
      
          require 'benchmark/ips'
      
          text = '<p>' + Note.select("string_agg(note, '') AS note").limit(50).take[:note] + '</p>'
          document = Nokogiri::HTML.fragment(text)
          filter = Banzai::Filter::AutolinkFilter.new(document, autolink: true)
      
          puts "Input size: #{(text.bytesize.to_f / 1024 / 1024).round(2)} MB"
      
          filter.rinku_parse
      
          Benchmark.ips(time: 15) do |bench|
            bench.report 'text_parse' do
              filter.text_parse
            end
      
            bench.report 'text_parse_fast' do
              filter.text_parse_fast
            end
      
            bench.compare!
          end
      
      Here the "text_parse_fast" method is the new implementation and
      "text_parse" the old one. The input size was around 180 MB. Running this
      benchmark outputs the following:
      
          Input size: 181.16 MB
          Calculating -------------------------------------
                    text_parse     1.000  i/100ms
               text_parse_fast     9.000  i/100ms
          -------------------------------------------------
                    text_parse     13.021  (±15.4%) i/s -    188.000
               text_parse_fast    112.741  (± 3.5%) i/s -      1.692k
      
          Comparison:
               text_parse_fast:      112.7 i/s
                    text_parse:       13.0 i/s - 8.66x slower
      
      Again the production timings may (and most likely will) vary depending
      on the input being processed.
  2. 02 Aug, 2016 16 commits
  3. 01 Aug, 2016 23 commits