How can I apply an array formula to each value returned by another array formula?

image

I'm using Microsoft 365 and want to find a way in Excel to sequentially apply array formulas - one on the output of the other - to yield a 2D table.

Specifically, I want to apply a regex with capture groups to a user-provided block of text. In an ideal world REGEXEXTRACT() would return a 2D array where each row corresponds to a match, and each column corresponds to a capture group. However, REGEXEXTRACT() currently only returns (using the return_mode argument) either a 1D array containing the capture groups of the first match, or a 1D array containing each match in full (taking no account of capture groups).

So the workaround I planned was to use the second option to output all the matches from REGEXEXTRACT(), then TRANSPOSE() them so they run vertically. A second REGEXEXTRACT() in the next column, applying the same regex, would then operate on each of the matches and output the capture groups horizontally.

Unfortunately, although 1) the first REGEXEXTRACT() works fine 2) the second REGEXEXTRACT() works fine when manually applied to each row output from the first, I haven't been able to find a way to successfully automatically replicate the formula for however many rows there are. If I use the spilled range operator #, although the formula then spills vertically, it doesn't output anything horizontally, meaning only the first capture group is returned.

Here's a simplified example showing the two REGEXEXTRACT() formulas working for the first row:



Here's an example showing the spill not operating both horizontally and vertically:



I'm sure this could be done in VBA but I'd prefer not to go down that route. I feel there's something simple I'm missing - is there?

There is a known (and well justified) limitation of array processing in Excel often described as the "array-of-arrays limitation".

There are many ways to work around this limitation. Perhaps one of the most common is the so-called reduce/vstack pattern.

Broadly speaking, we use REDUCE to iterate over the input array and within its LAMBDA, we use VSTACK to accumulate the results of processing each input row.

Suppose I have your example data and another row (to demonstrate that it will work on multiple rows) in B3:B4, then this formula returns the result:



Since your problem is applying this pattern twice (firstly to process the split lines of a single cell, then to process each cell in the input column), you may be interested in using a LAMBDA function called STACKER which I wrote to apply this pattern in the general case.

After importing the STACKER functions into the Advanced Formula Environment in Excel Labs:



You need only define a function to solve a single row of your input:



And then the formula to solve your problem becomes simpler:



=VSTACKER(SPLITDIMENSIONS)(VSTACKER(SPLITLINES)(B3:B4))

If you'd like to learn more about STACKER, you can review this video and this follow-up video.

The benefit of using STACKER is you needn't remember the syntax for applying the pattern - you just define a function to process one row of your input then wrap it with STACKER to process an array of such inputs.

EDIT:

To process a single line, you can use this formula:

EDIT2:

I just want to note that for this specific problem when it's only 1 cell that needs to be split, the simplest answer is in fact TEXTSPLIT. I posted a separate answer for that.

With Excel 365, you can also use python functions.

In your case it seems to be a much simpler solution:

[cell.split("x") for cell in xl("B2").splitlines()]



You noted in a comment to my other more general answer that you only need to apply this to a single cell. In that case, the easiest way is probably this:

Where "x" is the column delimiter and CHAR(10), which represents a line feed, is the row delimiter.

Ask AI
#1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70