Forums Forums Search & Filter Pro Gracefully import 30000 posts

Viewing 10 posts - 1 through 10 (of 26 total)
  • Trevor
    #58259

    Please accept some caveats before I reply:

    It is late here (2 hours past closing) and one should never reply when tired 😉
    30,000 blog posts is enormous. I doubt that we have tested it under that sort of load.
    Any import process effectively runs a loop. If you have access to the code, adding a small block of code at the end of each loop might help. See here.

    The resources you have available will be key. A PHP timeout error such as you see means that PHP is set to a fairly low and safe limit. This is so with PHP out of the box, because most hosts like it that way to preserve balance on their servers. A blog with that many posts ought to be on some sort of dedicated resource system. I am unsure how big though.

    What hosting is it on right now? Shared, reseller, VPS, Cloud, Dedicated ….. (the last three will have contractually available resource lists, so if it is one of them, what do you get for your money?

    Anonymous
    #58276

    Hi Trevor,

    Thanks for responding after hours.

    Regarding your suggestion to add individual post to the cache after each iteration: that seems like exactly what we did in APPROACH#2. Please refer my original post to see issues that we ran into with that.

    Our blog is hosted in AWS cloud on a t2.small instance.

    If we potentially switch to more resourceful server, which PHP settings do you recommend to change?

    Trevor
    #58287
    This reply has been marked as private.
    Anonymous
    #58337
    This reply has been marked as private.
    Trevor
    #58360

    I am in the UK, so unless you are too, telephone may not be cost effective! I am not quite sure what you meant about Skype? You can’t use it outside of the work network, or can’t use it on the corporate network? Why not from home (or on a tablet with mic/headset in a Starbucks)?

    Anonymous
    #59386

    Hello again,

    So we were able to significantly lower the page load time by disabling Auto Count setting in S&F.
    For 30 000 posts the load time of a page with S&F widget went down from 6 minutes (!!) to 7 seconds.

    As per our our conversation, we are considering chunking our posts in smaller groups. We ran some performance tests for 5 000 posts with Auto Count disabled. Here are the results:
    Load page with S&F widget ~3.6 sec
    Perform a search ~27 seconds

    While this timings are looking much better, 27 seconds is way too slow for searching 5 thousand posts.
    Is there anything else we are missing?

    Some info about our environment:

    Physical server: AWS t2.small; 2 GB memory; CPU 1 core up to 3.3 GHz
    OS: Windows Server 2012
    Webserver: IIS 8.5
    PHP 5.6.22
    Wordpress 4.6.1
    S&F Pro 4.6.1
    MySQL 5.6.27

    php max_memory_limit 256M;
    WP_MEMORY_LIMIT 256M;
    WP_MAX_MEMORY_LIMIT 256M;

    Snippet from S&F System Status:
    Database
    Table wp_search_filter_cache Exists: ✔ 204875 Rows Found
    Table wp_search_filter_term_results Exists: ✔ 9916 Rows Found
    Total Number of Fields Cached: 4

    Anonymous
    #59404

    UPDATE:
    We use “Search Everything” plugin to control which content should be searchable. Looking closer at its configuration we disabled “Search every custom field” setting because our custom fields are already searchable (thanks to S&F?). That decreased the search time drastically.

    5000 posts
    Load page with S&F widget: ~4 sec
    Perform search: ~8 sec

    30000 posts
    Load page with S&F widget: ~7 sec
    Perform search: ~18 sec

    Our blog is usable again, although these timings are still far from a great user experience. Any ideas on how to further improve search performance are much appreciated.

    Ross Moderator
    #59534

    Hey Rob

    I’m looking at performance at the moment. You’re definitely at the higher end of usage for the plugins capabilites – I’ve tested regularly with 4k posts & autocount enabled, but speed is often affected by how many fields & options you have.

    Do you have any fields with a considerable amount of options (+100)?

    Thanks

    Anonymous
    #59756

    Hey Ross,

    Yes, we have 4 custom fields with the options count varying from few hundred to few thousand.

    We do need all of them be available though. Is there a more efficient way to handle such fields?

    Anonymous
    #60371

    UPDATE
    I did some more research and observed few long running SQL queries that can potentially be improved.
    These get executed every with every search:

    1. Searching postmeta

    SELECT SQL_CALC_FOUND_ROWS distinct wp_posts.ID
    FROM   wp_posts
           LEFT JOIN wp_postmeta
                  ON wp_posts.id = wp_postmeta.post_id
           LEFT JOIN wp_term_relationships AS trel
                  ON ( wp_posts.id = trel.object_id )
           LEFT JOIN wp_term_taxonomy AS ttax
                  ON ( ( ttax.taxonomy = 'post_tag' )
                       AND trel.term_taxonomy_id = ttax.term_taxonomy_id )
           LEFT JOIN wp_terms AS tter
                  ON ( ttax.term_id = tter.term_id )
           LEFT JOIN wp_comments AS cmt
                  ON ( cmt.comment_post_id = wp_posts.id )
           LEFT JOIN wp_users AS u
                  ON ( wp_posts.post_author = u.id )
    WHERE  1 = 1
           AND ( ( wp_posts.id IN (<LIST_OF_POST_IDS>)
                   AND (( (( ( wp_posts.post_title LIKE '%searchterm%' )
                              OR ( wp_postmeta.meta_value LIKE '%searchterm%' )
                              OR ( wp_posts.post_content LIKE '%searchterm%' ) ))
                           OR (( tter.name LIKE '%searchterm%' ))
                           OR ( (( cmt.comment_content LIKE '%searchterm%' ))
                                AND cmt.comment_approved = '1' )
                           OR (( u.display_name LIKE '%searchterm%' )) ))
                   AND wp_posts.post_type = 'generalpost'
                   AND (( wp_posts.post_status = 'publish' )) )
                 AND post_type != 'revision' )
           AND post_status != 'future'
    ORDER  BY wp_posts.post_date DESC
    LIMIT  0, 15;

    This query is the biggest troublemaker and accounts for about 65% of our search time.
    How to improve: We have 18 post meta records for each post of our custom post type.
    However, we only need 4 of them to be searchable. So when I changed the query to look something like this:

    SELECT SQL_CALC_FOUND_ROWS distinct wp_posts.ID
    FROM   wp_posts
           LEFT JOIN wp_postmeta
                  ON wp_posts.id = wp_postmeta.post_id
           LEFT JOIN wp_term_relationships AS trel
                  ON ( wp_posts.id = trel.object_id )
           LEFT JOIN wp_term_taxonomy AS ttax
                  ON ( ( ttax.taxonomy = 'post_tag' )
                       AND trel.term_taxonomy_id = ttax.term_taxonomy_id )
           LEFT JOIN wp_terms AS tter
                  ON ( ttax.term_id = tter.term_id )
           LEFT JOIN wp_comments AS cmt
                  ON ( cmt.comment_post_id = wp_posts.id )
           LEFT JOIN wp_users AS u
                  ON ( wp_posts.post_author = u.id )
    WHERE  1 = 1
            <em>AND
            wp_postmeta.meta_key in ('custom_field1',
                                    'custom_field2',
                                    'custom_field3',
                                    'custom_field4')</em> 
           AND ( ( wp_posts.id IN (<LIST_OF_POST_IDS>)
                   AND (( (( ( wp_posts.post_title LIKE '%searchterm%' )
                              OR ( wp_postmeta.meta_value LIKE '%searchterm%' )
                              OR ( wp_posts.post_content LIKE '%searchterm%' ) ))
                           OR (( tter.name LIKE '%searchterm%' ))
                           OR ( (( cmt.comment_content LIKE '%searchterm%' ))
                                AND cmt.comment_approved = '1' )
                           OR (( u.display_name LIKE '%searchterm%' )) ))
                   AND wp_posts.post_type = 'generalpost'
                   AND (( wp_posts.post_status = 'publish' )) )
                 AND post_type != 'revision' )
           AND post_status != 'future'
    ORDER  BY wp_posts.post_date DESC
    LIMIT  0, 15;

    it ran 2 times faster. I didn’t find an option to exclude certain post meta fields from search in the plugin settings. If there is a way to do that, please point me.

    2. Queries to search and filter cache

    SELECT DISTINCT field_value
        FROM wp_search_filter_cache
        WHERE field_name = '_sfm_customfield1'
    SELECT DISTINCT field_value
        FROM wp_search_filter_cache
        WHERE field_name = '_sfm_customfield2'
    SELECT DISTINCT field_value
        FROM wp_search_filter_cache
        WHERE field_name = '_sfm_customfield3'
    SELECT DISTINCT post_id 
        FROM wp_search_filter_cache
    SELECT DISTINCT field_value_num as field_value
        FROM wp_search_filter_cache
        WHERE field_name = '_sft_post_tag'

    These are pretty lengthy too and sum up to ~2 seconds. I don’t see a purpose of executing these every time we search though. Could you please comment on this?

    Thank you

Viewing 10 posts - 1 through 10 (of 26 total)